The Grammar
A grammar encapsulates a set of rules. The grammar class is a protocol base class for all grammars; it is essentially an interface contract. It is a template class parameterized by its derived class, DerivedT, and its context, ContextT. ContextT defaults to parser_context, a predefined context, and you need not be concerned with it unless you wish to tweak the low-level behavior of the grammar. Detailed information on the ContextT template parameter is provided elsewhere. The grammar relies on DerivedT, a grammar subclass, to define the actual rules.
The declaration below is the public API. There may actually be more template parameters after ContextT. Anything after the ContextT parameter is strictly for internal use and should not concern the client.
template<
    typename DerivedT,
    typename ContextT = parser_context>
struct grammar;
A concrete sub-class inheriting from grammar is expected to have a nested template class (or struct) named definition:
It is a nested template class with a typename ScannerT parameter.
Its constructor defines the rules.
Its constructor is passed a reference to the actual grammar, self.
It has a member function named start that returns a reference to the start rule.
struct my_grammar : public grammar<my_grammar>
{
    template <typename ScannerT>
    struct definition
    {
        rule<ScannerT> r;
        definition(my_grammar const& self) { r = /*..define here..*/; }
        rule<ScannerT> const& start() const { return r; }
    };
};
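To see why the contract has this shape, it can help to model how a base class might consume it. The following is a toy sketch using only the standard library; it is not Spirit's implementation. Every name in it (toy_scanner, toy_rule, toy_grammar, digits, all_digits) is hypothetical. It shows the base class recovering the derived type, instantiating the nested definition once a scanner type is known, and calling start().

```cpp
#include <cctype>
#include <functional>
#include <string>

// Hypothetical stand-in for a scanner: a current position and an end.
template <typename IteratorT>
struct toy_scanner
{
    IteratorT first;
    IteratorT last;
};

// Hypothetical stand-in for rule<ScannerT>: a callable tied to one scanner type.
template <typename ScannerT>
using toy_rule = std::function<bool(ScannerT&)>;

// Hypothetical stand-in for grammar<DerivedT>.
template <typename DerivedT>
struct toy_grammar
{
    template <typename ScannerT>
    bool parse(ScannerT& scan) const
    {
        // The base class knows the derived type at compile time.
        DerivedT const& self = static_cast<DerivedT const&>(*this);
        // The definition is instantiated only here, once the scanner type is known.
        typename DerivedT::template definition<ScannerT> def(self);
        return def.start()(scan);
    }
};

struct digits : toy_grammar<digits>
{
    template <typename ScannerT>
    struct definition
    {
        toy_rule<ScannerT> r;

        // The constructor defines the rule and receives the grammar itself.
        explicit definition(digits const& /*self*/)
        {
            r = [](ScannerT& scan) {
                bool matched = false;
                while (scan.first != scan.last &&
                       std::isdigit(static_cast<unsigned char>(*scan.first)))
                {
                    ++scan.first;
                    matched = true;
                }
                return matched;
            };
        }

        toy_rule<ScannerT> const& start() const { return r; }
    };
};

// True if the whole range [first, last) is one or more decimal digits.
template <typename IteratorT>
bool all_digits(IteratorT first, IteratorT last)
{
    toy_scanner<IteratorT> scan{first, last};
    digits g;
    return g.parse(scan) && scan.first == scan.last;
}
```

Calling all_digits with a pair of char const* pointers and again with std::string iterators instantiates digits::definition twice, once per scanner type, while digits itself is written only once.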
Decoupling the scanner type, and hence the iterator, from the rules that form a grammar allows the grammar to be used in different contexts, possibly with different scanners and iterators. We don't care what scanner we are dealing with. The user-defined my_grammar can be instantiated without regard to a scanner type and can be used as a parser with any type of scanner. In short, unlike the rule, the grammar is not tied to a specific scanner type. See "Scanner Business" for why this is important and for further detail on the scanner-rule coupling problem.
Our grammar above may now be instantiated and put into action:
my_grammar g;
if (parse(first, last, g, space_p).full)
    cout << "parsing succeeded\n";
else
    cout << "parsing failed\n";
my_grammar IS-A parser and can be used anywhere a parser is expected, even referenced by another rule:
rule<> r = g >> str_p("cool huh?");
Following our original calculator example, here it is now rewritten using a grammar:
struct calculator : public grammar<calculator>
{
    template <typename ScannerT>
    struct definition
    {
        definition(calculator const& self)
        {
            group = '(' >> expression >> ')';
            factor = integer | group;
            term = factor >> *(('*' >> factor) | ('/' >> factor));
            expression = term >> *(('+' >> term) | ('-' >> term));
        }

        rule<ScannerT> expression, term, factor, group;

        rule<ScannerT> const&
        start() const { return expression; }
    };
};
A fully working example with semantic actions is part of the Spirit distribution.
[ See libs/spirit/example/fundamental/calc/calc_plain.cpp ]
self
You might notice that the definition of the grammar has a constructor that accepts a const reference to the outer grammar. In the example above, calculator::definition takes a calculator const& self. While it is unused in that example, it is often very useful: the self argument is the definition's window to the outside world. For example, the calculator class might hold a reference to some state information that the definition can update, through semantic actions, as parsing proceeds.
As a grammar grows more complicated, it is a good idea to group its parts into logical modules. For instance, when writing a language, it might be wise to put expressions and statements into separate grammar capsules. The grammar takes advantage of the encapsulation properties of C++ classes. The declarative nature of classes makes them a perfect fit for the definition of grammars. Since the grammar is nothing more than a class declaration, we can conveniently publish it in header files. The idea is that once written and fully tested, a grammar can be reused in many contexts. We now have the notion of grammar libraries.
An instance of a grammar may be used in different places multiple times without any problem. The implementation is tuned to allow this at the expense of some overhead. However, we can save considerable cycles and bytes if we are certain that a grammar will only have a single instance. If this is desired, simply define BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE before including any Spirit header files.
#define BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE
On the other hand, if a grammar is intended to be used in multithreaded code, we should define BOOST_SPIRIT_THREADSAFE before including any Spirit header files. In this case it is also required to link against Boost.Threads.
#define BOOST_SPIRIT_THREADSAFE
Copyright © 1998-2003 Joel de Guzman
Permission to copy, use, modify, sell and distribute this document
is granted provided this copyright notice appears in all copies. This document
is provided "as is" without express or implied warranty, and with
no claim as to its suitability for any purpose.