This page is a compilation of best practices for using Spirit.
- Separate grammar construction from parsing. This is pretty much C++ 101 and not specific to Spirit, but since it is short, let’s have it as our first entry. Spirit ships with lots of examples, and for brevity, parsing in those examples immediately follows the construction of the grammar. Example (example/qi/roman.cpp):
    roman roman_parser; // Our grammar
    /*...*/
    bool r = parse(iter, end, roman_parser, result);
In real-world usage, this is not efficient, because the grammar is rebuilt every time something is parsed. Grammars are meant to be constructed once and used many times, so it is always a good idea to separate construction from parsing.
There are exceptions, for sure. Daniel James noted (see comments below) that for non-context-free grammars that require a reference to some state, the easiest way is to construct a new grammar each time.
- Avoid complex rules. Rules with complex definitions hurt the compiler badly. We’ve seen rules that are more than a hundred lines long and take a couple of minutes to compile. On some compilers, experience shows that compile time grows exponentially with the length of the RHS expression. C++ compilers were not designed to handle such big expressions, and some simply cannot cope and crash. It is always best to break complex rules into smaller, more manageable parts. Doing so also makes the rules more readable.
- Avoid complex grammars. Try as much as possible to modularize big grammars into smaller sub-grammars. Spirit grammars are composable. Try to identify the grammar parts, especially those that can be reused, and separate them into their own sub-grammars. Reusable grammars are a real advantage. For example, how often have you written a rule for identifiers?
- Take things one step at a time. Don’t try to write a grammar that covers all the complexity of your input. Start with the simplest piece of the input and write a parser for that. Gradually add more rules to your grammar as you cover more complexity in the input.