Numerics
Like chlit, strlit and the rest, numeric parsers are also primitives. They are given a section of their own to provide a better focus on this important building block. The framework includes a couple of predefined objects for parsing signed and unsigned integers and real numbers. These parsers are fully parametric: most of the important aspects of numeric parsing can be finely adjusted, including the radix base, the minimum and maximum number of allowable digits, the exponent and the fraction. Policies control the real number parsers' behavior. There are some predefined policies covering the most common real number formats, but the user can supply her own when needed.
The uint_parser class is the simplest among the members of the numerics package. It can parse unsigned integers of arbitrary length and size, from ordinary primitive C/C++ integers to user defined scalars such as bigints (unlimited precision integers). Like most of the classes in Spirit, the uint_parser is a template class whose template parameters fine tune its behavior. The uint_parser is so flexible that the other numeric parsers are implemented using it as the backbone.
template
<
typename T = unsigned,
int Radix = 10,
unsigned MinDigits = 1,
int MaxDigits = -1
>
struct uint_parser { /*...*/ };
uint_parser template parameters | |
T | The underlying type of the numeric parser. Defaults to unsigned int |
Radix | The radix base. This can be 2 (binary), 8 (octal), 10 (decimal) or 16 (hexadecimal). Defaults to 10 (decimal) |
MinDigits | The minimum number of digits allowable. Defaults to 1 |
MaxDigits | The maximum number of digits allowable. If this is -1 (the default), the maximum is unbounded |
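To make the effect of Radix, MinDigits and MaxDigits concrete, here is a minimal standard-library sketch. It is not Spirit's implementation: parse_uint, uint_result and digit_value are hypothetical names invented for this illustration, the value type is fixed to unsigned long, and overflow is not checked.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type (not part of Spirit): `matched` tells whether the
// parse succeeded, `value` is the number read, `length` is the number of
// characters consumed.
struct uint_result {
    bool matched;
    unsigned long value;
    std::size_t length;
};

// Returns the numeric value of `c` in the given radix, or -1 if `c` is not
// a valid digit for that radix.
inline int digit_value(char c, int radix) {
    int v = -1;
    if (c >= '0' && c <= '9') v = c - '0';
    else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
    return (v >= 0 && v < radix) ? v : -1;
}

// Hypothetical stand-in for uint_parser<T, Radix, MinDigits, MaxDigits>.
// Consumes at most MaxDigits digits (unbounded when MaxDigits is negative)
// and fails when fewer than MinDigits digits are available.
template <int Radix = 10, unsigned MinDigits = 1, int MaxDigits = -1>
uint_result parse_uint(const std::string& input, std::size_t pos = 0) {
    unsigned long value = 0;
    std::size_t count = 0;
    while (pos + count < input.size()
           && (MaxDigits < 0 || count < static_cast<std::size_t>(MaxDigits))) {
        int d = digit_value(input[pos + count], Radix);
        if (d < 0) break;
        value = value * Radix + static_cast<unsigned long>(d);
        ++count;
    }
    if (count < MinDigits) return uint_result{false, 0, 0};
    return uint_result{true, value, count};
}
```

With these assumptions, parse_uint<10, 1, 3>("1234") stops after "123", while parse_uint<10, 3, 3>("12,") fails because only two digits are available.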
Predefined uint_parsers | |
bin_p | uint_parser<unsigned, 2, 1, -1> const |
oct_p | uint_parser<unsigned, 8, 1, -1> const |
uint_p | uint_parser<unsigned, 10, 1, -1> const |
hex_p | uint_parser<unsigned, 16, 1, -1> const |
The following example shows how the uint_parser can be used to parse thousand separated numbers. The example can correctly parse numbers such as 1,234,567,890.
uint_parser<unsigned, 10, 1, 3> uint3_p;        // 1..3 digits
uint_parser<unsigned, 10, 3, 3> uint3_3_p;      // exactly 3 digits
ts_num_p = (uint3_p >> *(',' >> uint3_3_p));    // our thousand separated number parser
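The matching behavior of this expression can be traced with a small standard-library sketch. It is an illustration only, not Spirit code: parse_thousands, ts_result and read_digits are hypothetical names, and the sketch additionally accumulates the numeric value, which the expression above does not do by itself.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: `length` is the number of characters matched.
struct ts_result {
    bool matched;
    unsigned long long value;
    std::size_t length;
};

// Reads between min_digits and max_digits decimal digits starting at `pos`.
// On success stores the value in `out`, advances `pos` and returns true.
inline bool read_digits(const std::string& s, std::size_t& pos,
                        std::size_t min_digits, std::size_t max_digits,
                        unsigned long long& out) {
    std::size_t i = pos;
    unsigned long long v = 0;
    while (i < s.size() && i - pos < max_digits
           && std::isdigit(static_cast<unsigned char>(s[i]))) {
        v = v * 10 + static_cast<unsigned long long>(s[i] - '0');
        ++i;
    }
    if (i - pos < min_digits) return false;
    out = v;
    pos = i;
    return true;
}

// Mirrors uint3_p >> *(',' >> uint3_3_p): a leading group of 1..3 digits,
// followed by zero or more groups of a comma and exactly 3 digits.
inline ts_result parse_thousands(const std::string& s) {
    std::size_t pos = 0;
    unsigned long long total = 0;
    if (!read_digits(s, pos, 1, 3, total)) return ts_result{false, 0, 0};
    for (;;) {
        std::size_t next = pos;
        unsigned long long group = 0;
        if (next >= s.size() || s[next] != ',') break;
        ++next;
        // If the group is incomplete, the comma is left unconsumed.
        if (!read_digits(s, next, 3, 3, group)) break;
        total = total * 1000 + group;
        pos = next;
    }
    return ts_result{true, total, pos};
}
```

Under these assumptions "1,234,567,890" is matched in full, whereas "12,34" matches only the leading "12" because the second group does not have exactly three digits.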
bin_p, oct_p, uint_p and hex_p are parser generator objects designed to be used within expressions. Here's an example of a rule that parses a comma delimited list of numbers (we've seen this before):
list_of_numbers = real_p >> *(',' >> real_p)
;
Later, we shall see how we can extract the actual numbers parsed by the numeric
parsers. We shall deal with this when we get to the section on specialized
actions.
The int_parser can parse signed integers of arbitrary length and size.
This is almost the same as the uint_parser. The only difference is
the additional task of parsing the '+'
or '-' sign preceding the number. The class interface
is the same as that of the uint_parser.
A predefined int_parser | |
int_p | int_parser<int, 10, 1, -1> const |
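The extra sign step can be sketched with the standard library alone. This is only an illustration of the idea, not Spirit's code: parse_int and int_result are hypothetical names, the radix is fixed to 10, and overflow is not checked.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: `length` is the number of characters consumed.
struct int_result {
    bool matched;
    long value;
    std::size_t length;
};

// Optional '+' or '-' followed by at least one decimal digit.
inline int_result parse_int(const std::string& s) {
    std::size_t pos = 0;
    bool negative = false;
    if (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
        negative = (s[pos] == '-');
        ++pos;
    }
    std::size_t digits_begin = pos;
    long magnitude = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
        magnitude = magnitude * 10 + (s[pos] - '0');
        ++pos;
    }
    // A sign with no digits after it is not a number.
    if (pos == digits_begin) return int_result{false, 0, 0};
    return int_result{true, negative ? -magnitude : magnitude, pos};
}
```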
The real_parser can parse real numbers of arbitrary length and size limited by its underlying parametric type T. The real_parser is a template class with 2 template parameters. Here's the real_parser template interface:
template
<
typename T = double,
typename RealPoliciesT = ureal_parser_policies<T>
>
struct real_parser;
The first template parameter is its underlying type T. This defaults to double.
Parsing special numeric types: Notice that T can be specified by the user. This is the underlying data type of the parser. This implies that we can use the numeric parsers to parse user defined numeric types such as fixed_point (fixed point reals) and bigint (unlimited precision integers).
The second template parameter is a class that groups the policies; it defaults
to ureal_parser_policies<T>. As already mentioned, policies control
the real number parsers' behavior. The default policies take care of the most
common case (there are many ways to represent, and hence parse,
real numbers). In most cases, the default setting of the real_parser
is sufficient and can be used straight out of the box. In fact, there are two
real_parsers predefined for immediate use:
Predefined real_parsers | |
ureal_p | real_parser<double, ureal_parser_policies<double> > const |
real_p | real_parser<double, real_parser_policies<double> > const |
We've seen real_p before. ureal_p is its unsigned variant.
The default policies provided are designed to parse C/C++ style real numbers of the form nnn.fffEeee where nnn is the whole number part, fff is the fractional part, E is 'e' or 'E' and eee is the exponent, optionally preceded by '-' or '+'. This corresponds to the following grammar:
floatingliteral
= fractionalconstant >> !exponentpart
| +digit_p >> exponentpart
;
fractionalconstant
= *digit_p >> '.' >> +digit_p
| +digit_p >> '.'
;
exponentpart
= ('e' | 'E') >> !('+' | '-') >> +digit_p
;
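The grammar above can be read as a small hand-written recognizer. The following standard-library sketch is one such reading, assuming ordered choice between alternatives; match_floating_literal, match_fraction, match_exponent, skip_digits and no_match are hypothetical names, and the code only recognizes the text without computing a value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Sentinel returned when a sub-rule does not match (hypothetical helper).
const std::size_t no_match = static_cast<std::size_t>(-1);

// Skips a run of digits starting at pos; returns the position after the run.
inline std::size_t skip_digits(const std::string& s, std::size_t pos) {
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos;
}

// exponentpart = ('e' | 'E') >> !('+' | '-') >> +digit_p
inline std::size_t match_exponent(const std::string& s, std::size_t pos) {
    if (pos >= s.size() || (s[pos] != 'e' && s[pos] != 'E')) return no_match;
    ++pos;
    if (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) ++pos;
    std::size_t end = skip_digits(s, pos);
    return end > pos ? end : no_match;
}

// fractionalconstant = *digit_p >> '.' >> +digit_p | +digit_p >> '.'
inline std::size_t match_fraction(const std::string& s, std::size_t pos) {
    std::size_t int_end = skip_digits(s, pos);
    if (int_end >= s.size() || s[int_end] != '.') return no_match;
    std::size_t frac_end = skip_digits(s, int_end + 1);
    if (frac_end > int_end + 1) return frac_end;  // first alternative
    if (int_end > pos) return int_end + 1;        // second alternative
    return no_match;
}

// floatingliteral = fractionalconstant >> !exponentpart
//                 | +digit_p >> exponentpart
// Returns the number of characters matched from the start of s, or no_match.
inline std::size_t match_floating_literal(const std::string& s) {
    std::size_t end = match_fraction(s, 0);
    if (end != no_match) {
        std::size_t exp_end = match_exponent(s, end);
        return exp_end != no_match ? exp_end : end;
    }
    std::size_t int_end = skip_digits(s, 0);
    if (int_end == 0) return no_match;
    return match_exponent(s, int_end);
}
```

Read this way, "5." and ".5" both match, "5e-3" matches through the second alternative, and a bare "5" does not match because it has neither a decimal point nor an exponent.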
The parser policies break down real number parsing into 6 steps:
1 | parse_sign | Parse the prefix sign |
2 | parse_n | Parse the integer at the left of the decimal point |
3 | parse_dot | Parse the decimal point |
4 | parse_frac_n | Parse the fraction after the decimal point |
5 | parse_exp | Parse the exponent prefix (e.g. 'e') |
6 | parse_exp_n | Parse the actual exponent |
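As a rough picture of how the six steps fit together, here is a standard-library sketch that runs them in order and combines the pieces into a double. It is not the real_parser implementation: parse_real, read_digits and read_sign are hypothetical names, each step is hard-coded rather than delegated to a policy class, and as a simplification the sketch also accepts input that has digits but no decimal point or exponent.

```cpp
#include <cctype>
#include <cmath>
#include <cstddef>
#include <string>

// Reads a run of decimal digits at `pos`, stores their value in `out`,
// advances `pos`, and returns how many digits were read.
inline std::size_t read_digits(const std::string& s, std::size_t& pos, double& out) {
    std::size_t count = 0;
    out = 0.0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
        out = out * 10.0 + (s[pos] - '0');
        ++pos;
        ++count;
    }
    return count;
}

// Consumes an optional '+' or '-' at `pos`; returns true if it was '-'.
inline bool read_sign(const std::string& s, std::size_t& pos) {
    if (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
        return s[pos++] == '-';
    }
    return false;
}

// Runs the six steps in order. Returns true and sets `result` on success.
inline bool parse_real(const std::string& s, double& result) {
    std::size_t pos = 0;
    bool negative = read_sign(s, pos);                          // 1. parse_sign
    double whole = 0.0;
    std::size_t whole_digits = read_digits(s, pos, whole);      // 2. parse_n
    double frac = 0.0;
    std::size_t frac_digits = 0;
    if (pos < s.size() && s[pos] == '.') {                      // 3. parse_dot
        ++pos;
        frac_digits = read_digits(s, pos, frac);                // 4. parse_frac_n
    }
    if (whole_digits == 0 && frac_digits == 0) return false;
    double value = whole + frac / std::pow(10.0, static_cast<double>(frac_digits));
    if (pos < s.size() && (s[pos] == 'e' || s[pos] == 'E')) {   // 5. parse_exp
        std::size_t exp_pos = pos + 1;
        bool exp_negative = read_sign(s, exp_pos);
        double exponent = 0.0;
        if (read_digits(s, exp_pos, exponent) > 0) {            // 6. parse_exp_n
            value *= std::pow(10.0, exp_negative ? -exponent : exponent);
        }
    }
    result = negative ? -value : value;
    return true;
}
```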
[ From here on, required reading: The Scanner, In-depth The Parser and In-depth The Scanner ]
Before we move on, a small utility parser is included here to ease the parsing of the '-' or '+' sign. While it is easy to write one:
sign_p = (ch_p('+') | '-');
it is not possible to extract the actual sign (positive or negative) without resorting to semantic actions. The sign_p parser has a bool attribute, returned to the caller through the match object, which after parsing is set to true if the parsed sign is negative. This attribute can be used to detect whether a negative sign has been parsed. Examples:
bool is_negative;
r = sign_p[assign(is_negative)];
or simply...
// directly extract the result from the match result's value
bool is_negative = sign_p.parse(scan).value();
The sign_p parser expects attached semantic actions to have a signature (see Specialized Actions for further detail) compatible with:
Signature for functions:
void func(bool is_negative);
Signature for functors:
struct ftor
{
void operator()(bool is_negative) const;
};
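The relationship between the bool attribute and an attached action can be illustrated without Spirit. In the sketch below, parse_sign and store_sign are hypothetical names: parse_sign plays the role of sign_p with an action attached, and store_sign is a functor with the signature shown above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for sign_p with an attached action: consumes one
// '+' or '-' at `pos`, calls action(is_negative), and returns true.
// Returns false and leaves `pos` untouched when no sign is present.
template <typename Action>
bool parse_sign(const std::string& s, std::size_t& pos, Action action) {
    if (pos >= s.size() || (s[pos] != '+' && s[pos] != '-')) return false;
    bool is_negative = (s[pos] == '-');
    ++pos;
    action(is_negative);
    return true;
}

// A functor compatible with void operator()(bool is_negative) const.
// It stores the attribute in a variable owned by the caller.
struct store_sign {
    bool& target;
    explicit store_sign(bool& t) : target(t) {}
    void operator()(bool is_negative) const { target = is_negative; }
};
```

The action is only invoked when a sign is actually consumed, so the caller's variable keeps its previous value when the parse fails.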
template <typename T>
struct ureal_parser_policies
{
    typedef uint_parser<T, 10, 1, -1> uint_parser_t;
    typedef int_parser<T, 10, 1, -1> int_parser_t;

    template <typename ScannerT>
    static typename match_result<ScannerT, nil_t>::type
    parse_sign(ScannerT& scan)
    { return scan.no_match(); }

    template <typename ScannerT>
    static typename parser_result<uint_parser_t, ScannerT>::type
    parse_n(ScannerT& scan)
    { return uint_parser_t().parse(scan); }

    template <typename ScannerT>
    static typename parser_result<chlit<>, ScannerT>::type
    parse_dot(ScannerT& scan)
    { return ch_p('.').parse(scan); }

    template <typename ScannerT>
    static typename parser_result<uint_parser_t, ScannerT>::type
    parse_frac_n(ScannerT& scan)
    { return uint_parser_t().parse(scan); }

    template <typename ScannerT>
    static typename parser_result<chlit<>, ScannerT>::type
    parse_exp(ScannerT& scan)
    { return nocase_d['e'].parse(scan); }

    template <typename ScannerT>
    static typename parser_result<int_parser_t, ScannerT>::type
    parse_exp_n(ScannerT& scan)
    { return int_parser_t().parse(scan); }
};
The default ureal_parser_policies uses the lower level integer numeric parsers to do its job.
template <typename T>
struct real_parser_policies : public ureal_parser_policies<T>
{
    template <typename ScannerT>
    static typename parser_result<sign_parser, ScannerT>::type
    parse_sign(ScannerT& scan)
    { return sign_p.parse(scan); }
};
Notice how real_parser_policies replaces the parse_sign of ureal_parser_policies, from which it is derived. The default real_parser_policies simply uses sign_p instead of scan.no_match() in the parse_sign step.
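The mechanism at work is ordinary C++ name hiding: a static member function declared in the derived policy class hides the one inherited from the base, and the parser picks it up because it always calls the step through its policy template parameter. The following standard-library sketch shows the same pattern in isolation; unsigned_policies, signed_policies and parse_number are hypothetical names and are much simpler than the Spirit classes.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical base policies: the sign step never matches.
struct unsigned_policies {
    // Returns true if a '-' was consumed; this version consumes nothing.
    static bool parse_sign(const std::string&, std::size_t&) { return false; }

    // Reads one or more decimal digits into `out`.
    static bool parse_n(const std::string& s, std::size_t& pos, long& out) {
        std::size_t start = pos;
        out = 0;
        while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) {
            out = out * 10 + (s[pos] - '0');
            ++pos;
        }
        return pos > start;
    }
};

// Derived policies: only parse_sign is replaced, parse_n is inherited.
struct signed_policies : unsigned_policies {
    // Consumes an optional '+' or '-'; returns true if it was '-'.
    static bool parse_sign(const std::string& s, std::size_t& pos) {
        if (pos < s.size() && (s[pos] == '+' || s[pos] == '-')) {
            return s[pos++] == '-';
        }
        return false;
    }
};

// The parser calls every step through the Policies parameter, so the
// derived class's parse_sign is used when signed_policies is supplied.
template <typename Policies>
bool parse_number(const std::string& s, long& out) {
    std::size_t pos = 0;
    bool negative = Policies::parse_sign(s, pos);
    long magnitude = 0;
    if (!Policies::parse_n(s, pos, magnitude)) return false;
    out = negative ? -magnitude : magnitude;
    return true;
}
```

With these definitions, parse_number<signed_policies> accepts "-5" while parse_number<unsigned_policies> rejects it, even though both share the same digit-reading step.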
Other "specialized" real parser policies can reuse these defaults. One or more of these policies may be replaced by the client. For example, here's a real number parser that parses thousands separated numbers with at most two decimal places and no exponent:
template <typename T>
struct ts_real_parser_policies : public ureal_parser_policies<T>
{
    // These policies can be used to parse thousand separated
    // numbers with at most 2 decimal digits after the decimal
    // point. e.g. 123,456,789.01

    typedef uint_parser<int, 10, 1, 2> uint2_t;
    typedef uint_parser<T, 10, 1, -1> uint_parser_t;
    typedef int_parser<int, 10, 1, -1> int_parser_t;

    ////////////////////////////////// 2 decimal places Max
    template <typename ScannerT>
    static typename parser_result<uint2_t, ScannerT>::type
    parse_frac_n(ScannerT& scan)
    { return uint2_t().parse(scan); }

    //////////////////////////////////
    template <typename ScannerT>
    static typename parser_result<chlit<>, ScannerT>::type
    parse_exp(ScannerT& scan)
    { return scan.no_match(); }

    //////////////////////////////////
    template <typename ScannerT>
    static typename parser_result<int_parser_t, ScannerT>::type
    parse_exp_n(ScannerT& scan)
    { return scan.no_match(); }

    ////////////////////////////////// Thousands separated numbers
    template <typename ScannerT>
    static typename parser_result<uint_parser_t, ScannerT>::type
    parse_n(ScannerT& scan)
    {
        typedef typename parser_result<uint_parser_t, ScannerT>::type RT;
        uint_parser<unsigned, 10, 1, 3> uint3_p;
        uint_parser<unsigned, 10, 3, 3> uint3_3_p;
        if (RT hit = uint3_p.parse(scan))
        {
            T n;
            while (match<> next = (',' >> uint3_3_p[assign(n)]).parse(scan))
            {
                hit.value() *= 1000;
                hit.value() += n;
                scan.concat_match(hit, next);
            }
            return hit;
        }
        return scan.no_match();
    }
};

// ts_real_p, our thousand separated numeric parser
real_parser<double, ts_real_parser_policies<double> > const
    ts_real_p = real_parser<double, ts_real_parser_policies<double> >();
Copyright © 1998-2002 Joel de Guzman
Permission to copy, use, modify, sell and distribute this document
is granted provided this copyright notice appears in all copies. This document
is provided "as is" without express or implied warranty, and with
no claim as to its suitability for any purpose.