<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Boost.Spirit &#187; Qi Example</title>
	<atom:link href="http://boost-spirit.com/home/category/qi-example/feed/" rel="self" type="application/rss+xml" />
	<link>http://boost-spirit.com/home</link>
	<description>Home of The Boost.Spirit Library</description>
	<lastBuildDate>Sun, 04 Dec 2011 22:11:29 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>How to Optimize Qi</title>
		<link>http://boost-spirit.com/home/2011/07/23/how-to-optimize-qi/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-optimize-qi</link>
		<comments>http://boost-spirit.com/home/2011/07/23/how-to-optimize-qi/#comments</comments>
		<pubDate>Sat, 23 Jul 2011 15:49:28 +0000</pubDate>
		<dc:creator>Hartmut Kaiser</dc:creator>
				<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[User Experience]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/2011/07/23/how-to-optimize-qi/</guid>
		<description><![CDATA[Mike Lewis posted a marvelous experience report dubbed ‘Optimizing Boost Spirit &#8211; Blazing fast AST generation using boost::spirit’. He describes how he took an old compiler for the Epoch programming language (which was based on Spirit.Classic) and tuned it for performance using Spirit.Qi and Spirit.Lex. His results are exceptional: he got a roughly thousandfold [...]]]></description>
			<content:encoded><![CDATA[<p>Mike Lewis posted a marvelous experience report dubbed ‘<a href="http://code.google.com/p/scribblings-by-apoch/wiki/OptimizingBoostSpirit" target="_blank">Optimizing Boost Spirit &#8211; Blazing fast AST generation using boost::spirit</a>’. He describes how he took an old compiler for the <a href="http://code.google.com/p/epoch-language/" target="_blank">Epoch</a> programming language (which was based on <em>Spirit.Classic</em>) and tuned it for performance using <em>Spirit.Qi</em> and <em>Spirit.Lex</em>. His results are exceptional: he got a roughly thousandfold speedup compared to the old version. The complete code for his compiler can be downloaded from <a href="http://code.google.com/p/epoch-language/source/browse/" target="_blank">here</a>.</p>
<p><span id="more-1519"></span></p>
<p>He writes:</p>
<blockquote><p>This code illustrates several advanced techniques for parsing large inputs with complex Spirit grammars:</p>
<ul>
<li>Deferred construction and minimal copying of attribute values</li>
<li>Lexical analysis for faster backtracking</li>
<li>A special directive for using qi::symbols alongside a lexer</li>
<li>Linear allocators for faster AST node allocation</li>
<li>Intrusive reference counting for even faster AST node allocation/copying</li>
<li>Grammar transformations for general optimality</li>
<li>Abuse of the &amp;-predicate for skipping expensive productions</li>
<li>Dividing grammars into multiple implementation files for minimal recompilation times</li>
</ul>
</blockquote>
<p>Thanks, Mike, for sharing your work! I’m sure many <em>Spirit</em> developers will find your report enlightening and encouraging to read. Keep up the excellent work!</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2011/07/23/how-to-optimize-qi/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Spirit.Qi in the Real World</title>
		<link>http://boost-spirit.com/home/2011/06/08/spirit-qi-in-the-real-world/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spirit-qi-in-the-real-world</link>
		<comments>http://boost-spirit.com/home/2011/06/08/spirit-qi-in-the-real-world/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 21:34:32 +0000</pubDate>
		<dc:creator>Joel de Guzman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Boost Con 2011]]></category>
		<category><![CDATA[BoostCon]]></category>
		<category><![CDATA[Experience Level]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Intermediate]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[User Experience]]></category>
		<category><![CDATA[BoostCon 2011]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1496</guid>
		<description><![CDATA[This is the first time I missed attending BoostCon (May 15-20, 2011 – Aspen, Colorado). Fortunately, for those of us who were not able to attend, Marshall Clow uploaded some videos. Here&#8217;s one that&#8217;s relevant to Spirit: &#8220;Spirit.Qi in the Real World&#8221;, by Robert Stewart. Watch the presentation here: http://blip.tv/boostcon/spirit-qi-in-the-real-world-5254335 You can find the slides here: [...]]]></description>
			<content:encoded><![CDATA[<p>This is the first time I missed attending BoostCon (May 15-20, 2011 – Aspen, Colorado). Fortunately, for those of us who were not able to attend, Marshall Clow uploaded some videos. Here&#8217;s one that&#8217;s relevant to Spirit: &#8220;Spirit.Qi in the Real World&#8221;, by Robert Stewart. Watch the presentation here:</p>
<p><a href="http://blip.tv/boostcon/spirit-qi-in-the-real-world-5254335">http://blip.tv/boostcon/spirit-qi-in-the-real-world-5254335</a></p>
<p>You can find the slides here: <a rel="nofollow" href="https://github.com/boostcon/2011_presentations/raw/master/tue/spirit_qi_in_the_real_world.pdf">https://github.com/boostcon/2011_presentations/raw/master/tue/spirit_qi_in_the_real_world.pdf</a></p>
<blockquote><p>Past sessions on Spirit have focused on introducing Spirit or showing  extracts of real use, intermingled with tutorial highlights. Upon  writing real Spirit.Qi parsers, however, one quickly discovers that &#8220;the  devil is in the details.&#8221; There are special cases, tricks, and idioms  that one must discover by trial and error or, perhaps, by following the  Spirit mailing list, all of which take time and may not be convenient.  In this session, we’ll walk through the development of a Spirit.Qi  parser for printf()-style format strings. The result will be a  replacement for printf() that is typesafe and efficient.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2011/06/08/spirit-qi-in-the-real-world/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Keyword parser</title>
		<link>http://boost-spirit.com/home/2011/04/16/the-keyword-parser/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-keyword-parser</link>
		<comments>http://boost-spirit.com/home/2011/04/16/the-keyword-parser/#comments</comments>
		<pubDate>Sat, 16 Apr 2011 17:27:14 +0000</pubDate>
		<dc:creator>teajay</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Qi Example]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1416</guid>
		<description><![CDATA[The keyword parser construct has recently been added to Spirit&#8217;s repository (available in 1.47 or from svn). Here&#8217;s a small introduction to help you get started using the keyword parsers. Those of you familiar with the Nabialek trick will recognize it&#8217;s working under the hood. What you can achieve with the keywords parser can [...]]]></description>
			<content:encoded><![CDATA[<p>The keyword parser construct has recently been added to <a href="https://svn.boost.org/svn/boost/trunk/libs/spirit/repository/doc/html/index.html" target="_blank">Spirit&#8217;s repository</a> (available in 1.47 or from svn). Here&#8217;s a small introduction to help you get started using the keyword parsers.</p>
<p>Those of you familiar with the <a title="Nabialek trick" href="http://boost-spirit.com/home/articles/qi-example/nabialek-trick/" target="_blank">Nabialek trick</a> will recognize it&#8217;s working under the hood. What you can achieve with the keywords parser can also be achieved with the Nabialek trick but not always as elegantly or as efficiently.</p>
<p><span id="more-1416"></span></p>
<p>The two examples presented below are included in the Spirit repository and can be found in the folder:</p>
<pre class="brush: plain; title: ; notranslate">libs/spirit/repository/example/qi</pre>
<h4>Data members marked by keywords (<a href="https://svn.boost.org/svn/boost/trunk/libs/spirit/repository/example/qi/options.cpp" target="_blank">options.cpp</a>)</h4>
<p>For this small introduction we&#8217;ll consider parsing a program command line.</p>
<p>Options are commonly passed to applications delimited by option keywords:</p>
<pre class="brush: plain; title: ; notranslate">

mySuperCompiler --include includePath --define newSymbol=10 --output output.txt --define newSymbol2=20 --source mySourceFile
</pre>
<p>The order in which the options are specified doesn&#8217;t matter at all. The task of the parser we are going to write is to extract the individual options into some internal data structure we will use to control the program.</p>
<p>Here are the structures we could use to hold the options passed to our command line:</p>
<pre class="brush: cpp; title: ; notranslate">
// A basic preprocessor symbol
typedef std::pair&lt;std::string, int32_t&gt; preprocessor_symbol;

struct program_options {
   // symbol container type definition
   typedef std::vector&lt; preprocessor_symbol &gt; preprocessor_symbols_container;
   // include paths
   std::vector&lt;std::string&gt; includes;
   // preprocessor symbols
   preprocessor_symbols_container preprocessor_symbols;
   // output file name
   boost::optional&lt;std::string&gt; output_filename;
   // input file name
   std::string source_filename;
};
</pre>
<p>Of course, the structures are adapted to be compatible with Boost.Fusion so that the parsed data can be pulled into them easily.</p>
<p>Now let&#8217;s define our options rule:</p>
<pre class="brush: cpp; title: ; notranslate">

rule&lt;const char *, program_options(), space_type&gt; kwd_rule;

kwd_rule %= kwd(&quot;--include&quot;)[
                parse_string
            ]
          / kwd(&quot;--define&quot;) [
                parse_string
                &gt;&gt; (
                    (lit('=') &gt; int_) | attr(1)
                   )
            ]
          / kwd(&quot;--output&quot;,0,1)[
                parse_string
            ]
          / kwd(&quot;--source&quot;,1)[
                parse_string
            ]
          ;
</pre>
<p>The first thing to notice here is that we used the %= operator. This means that the parsing construct we just wrote has an attribute type compatible with the attribute type of our adapted structure!</p>
<p>This is one spot where the keyword parsing construct surpasses the Nabialek trick. The Nabialek trick just can&#8217;t do that.</p>
<p>On the next lines we define our keyword parsing constructs.  Writing</p>
<pre class="brush: cpp; title: ; notranslate">kwd(&quot;--include&quot;)[ parse_string ] </pre>
<p>is equivalent to writing:</p>
<pre class="brush: cpp; title: ; notranslate">lit(&quot;--include&quot;) &gt; parse_string</pre>
<p>The word &#8220;--include&#8221; must be followed by a string.</p>
<p>The <a title="kwd directive" href="http://svn.boost.org/svn/boost/trunk/libs/spirit/repository/doc/html/spirit_repository/qi_components/directives/kwd.html">kwd directive</a> can be combined using the <a title="Keyword list operator" href="http://svn.boost.org/svn/boost/trunk/libs/spirit/repository/doc/html/spirit_repository/qi_components/operators/keyword_list.html">/ operator</a>. The kwd directive and the / operator work closely together to achieve attribute compatibility while using the Nabialek trick under the hood.</p>
<p>One last thing to notice is the occurrence constraints that can be associated with a kwd directive. They work like the repeat directive and let you add validation checks inside the keyword parsing loop.</p>
<p>Writing</p>
<pre class="brush: cpp; title: ; notranslate">kwd(&quot;--output&quot;,0,1)[ parse_string ] </pre>
<p>means that the keyword &#8220;--output&#8221; may occur at most once. If it occurs more than once the parser will fail.</p>
<p>Writing</p>
<pre class="brush: cpp; title: ; notranslate">kwd(&quot;--source&quot;,1)[ parse_string ] </pre>
<p>means that the keyword &#8220;--source&#8221; must occur once and only once. This works just like the repeat directive.</p>
<p>Using occurrence constraints costs little in runtime performance and makes it easy to enforce constraints that would otherwise be much more difficult to formulate.</p>
<p>The kwd directive also exists in a case-insensitive variant: ikwd. You can combine kwd and ikwd freely inside the same keyword block at the cost of a small runtime overhead.</p>
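<p>To make the bookkeeping behind occurrence constraints concrete, here is a minimal plain C++ sketch with no Spirit involved. It assumes the keywords have already been split out of the input. The names <code>occurrence_rule</code>, <code>check_occurrences</code> and <code>option_rules</code> are hypothetical and not part of the Spirit repository. The sketch only illustrates the counting that a constraint such as <code>kwd(&quot;--output&quot;,0,1)</code> spares you from writing by hand; it is not how the keyword parser is implemented.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper type: the allowed number of occurrences of one keyword.
struct occurrence_rule {
    std::size_t min;
    std::size_t max;
};

// Returns true if every keyword in 'seen' is known and each known keyword
// occurs between min and max times (inclusive).
inline bool check_occurrences(
    std::map<std::string, occurrence_rule> const& rules,
    std::vector<std::string> const& seen)
{
    std::map<std::string, std::size_t> counts;
    for (std::size_t i = 0; i < seen.size(); ++i) {
        if (rules.find(seen[i]) == rules.end())
            return false;                          // unknown keyword
        ++counts[seen[i]];
    }
    for (std::map<std::string, occurrence_rule>::const_iterator it = rules.begin();
         it != rules.end(); ++it) {
        std::size_t const n = counts[it->first];   // 0 if never seen
        if (n < it->second.min || n > it->second.max)
            return false;
    }
    return true;
}

// Constraints mirroring the options rule above: --include and --define are
// unconstrained, --output is optional, --source is required exactly once.
inline std::map<std::string, occurrence_rule> option_rules()
{
    std::size_t const unlimited = static_cast<std::size_t>(-1);
    occurrence_rule const any = { 0, unlimited };
    occurrence_rule const optional = { 0, 1 };
    occurrence_rule const exactly_once = { 1, 1 };
    std::map<std::string, occurrence_rule> rules;
    rules["--include"] = any;
    rules["--define"] = any;
    rules["--output"] = optional;
    rules["--source"] = exactly_once;
    return rules;
}
```

<p>With these rules, a keyword sequence containing &#8220;--output&#8221; twice, or no &#8220;--source&#8221; at all, is rejected, which matches what the constraints on the kwd directives express declaratively.</p>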
<h4>Derived structures (<a href="https://svn.boost.org/svn/boost/trunk/libs/spirit/repository/example/qi/derived.cpp" target="_blank">derived.cpp</a>)</h4>
<p>A recent post in the mailing list gave me the idea to provide an example of how the keyword parser can be used to produce different derived structures depending on keywords placed in the input.</p>
<p>Here&#8217;s the problem as described by MM:</p>
<p>&#8220;I have a case where I have a prefix string that will distinguish what will follow it.</p>
<pre class="brush: plain; title: ; notranslate">prefix string - struct members</pre>
<p>this is what is read from the input stream. I have a base struct and 5 derived D1..D5, each derived has a different prefix as a static const std::string member. Parsing the prefix string tells me which struct D1..D5 I should parse after. All these derived structs are fusion adapted. There is a rule for each of the derived.&#8221;</p>
<p>To keep the example simple here are the classes we could consider:</p>
<pre class="brush: cpp; title: ; notranslate">

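struct base_type {
    base_type(const std::string &amp;name) : name(name)  {}
    // Virtual destructor so derived objects can be deleted through a base_type pointer
    virtual ~base_type() {}

    std::string name;
    virtual std::ostream &amp;output(std::ostream &amp;os) const {
        os&lt;&lt;&quot;Base : &quot;&lt;&lt;name;
        return os;
    }
};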

struct derived1 : public base_type {
    derived1(const std::string &amp;name, unsigned int data1) :
        base_type(name)
      , data1(data1)  {}

    unsigned int data1;
    virtual std::ostream &amp;output(std::ostream &amp;os) const {
        base_type::output(os);
        os&lt;&lt;&quot;, &quot;&lt;&lt;data1;
        return os;
    }
};

struct derived2 : public base_type {
    derived2(const std::string &amp;name, unsigned int data2) :
        base_type(name), data2(data2)  {}

    unsigned int data2;
    virtual std::ostream &amp;output(std::ostream &amp;os) const    {
        base_type::output(os);
        os&lt;&lt;&quot;, &quot;&lt;&lt;data2;
        return os;
    }
};

struct derived3 : public derived2 {
    derived3(const std::string &amp;name, unsigned int data2, double data3) :
      derived2(name,data2)
    , data3(data3)
    {}

    double data3;
    virtual std::ostream &amp;output(std::ostream &amp;os) const    {
        derived2::output(os);
        os&lt;&lt;&quot;, &quot;&lt;&lt;data3;
        return os;
    }
};
</pre>
<p>Our parse result must be a vector of pointers to our base class:</p>
<pre class="brush: cpp; title: ; notranslate">std::vector&lt;base_type*&gt;</pre>
<p>To get that done, we&#8217;ll use semantic actions inside the kwd directive:</p>
<pre class="brush: cpp; title: ; notranslate">

kwd_rule = kwd(&quot;derived1&quot;)[
              ('=' &gt; parse_string &gt; int_ )
              [phx::push_back(_val,phx::new_&lt;derived1&gt;(_1,_2))]
           ]
         / kwd(&quot;derived2&quot;)[
              ('=' &gt; parse_string &gt; int_ )
              [phx::push_back(_val,phx::new_&lt;derived2&gt;(_1,_2))]
           ]
         / kwd(&quot;derived3&quot;)[
              ('=' &gt; parse_string &gt; int_ &gt; double_)
              [phx::push_back(_val,phx::new_&lt;derived3&gt;(_1,_2,_3))]
           ]
           ;
</pre>
<p>This rule will construct new derived classes and append them to our result vector during parsing. The input parsed by this construct is of the form:</p>
<pre class="brush: plain; title: ; notranslate"> derived2 = &quot;object1&quot; 10 derived3= &quot;object2&quot; 40 20.0 </pre>
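<p>Because the semantic actions allocate with <code>new_</code>, whoever receives the result vector owns the objects in it. The following is a minimal sketch of consuming and releasing such a vector. It uses a reduced stand-in for the hierarchy above and assumes the base class has a virtual destructor. The helpers <code>dump</code> and <code>destroy_all</code> are hypothetical names, not part of the example in the repository.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the hierarchy above, reduced to what the sketch needs.
// Assumption: the base class declares a virtual destructor, which is
// required for deleting derived objects through a base_type pointer.
struct base_type {
    explicit base_type(std::string const& name) : name(name) {}
    virtual ~base_type() {}
    virtual std::ostream& output(std::ostream& os) const {
        return os << "Base : " << name;
    }
    std::string name;
};

struct derived1 : base_type {
    derived1(std::string const& name, unsigned int data1)
      : base_type(name), data1(data1) {}
    virtual std::ostream& output(std::ostream& os) const {
        return base_type::output(os) << ", " << data1;
    }
    unsigned int data1;
};

// Hypothetical helper: writes one line per parsed object.
inline std::string dump(std::vector<base_type*> const& objects)
{
    std::ostringstream os;
    for (std::size_t i = 0; i < objects.size(); ++i)
        objects[i]->output(os) << '\n';
    return os.str();
}

// Hypothetical helper: releases what the semantic actions allocated.
inline void destroy_all(std::vector<base_type*>& objects)
{
    for (std::size_t i = 0; i < objects.size(); ++i)
        delete objects[i];
    objects.clear();
}
```

<p>Storing smart pointers instead of raw pointers would remove the need for the explicit cleanup, at the cost of changing the attribute type of the rule.</p>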
<h4>Keywords vs Nabialek trick</h4>
<p>Here&#8217;s a small table to compare the features of the keyword parsing constructs and the Nabialek trick to help you decide which solution better suits your needs.</p>

<table id="wp-table-reloaded-id-1-no-1" class="wp-table-reloaded wp-table-reloaded-id-1">
<thead>
	<tr class="row-1 odd">
		<th class="column-1"></th><th class="column-2">Nabialek trick</th><th class="column-3">Keywords parser</th>
	</tr>
</thead>
<tbody>
	<tr class="row-2 even">
		<td class="column-1">Attribute propagation</td><td class="column-2">no</td><td class="column-3">yes</td>
	</tr>
	<tr class="row-3 odd">
		<td class="column-1">Runtime modification of the keyword set</td><td class="column-2">yes</td><td class="column-3">no</td>
	</tr>
	<tr class="row-4 even">
		<td class="column-1">Occurrence constraints</td><td class="column-2">not easily implemented</td><td class="column-3">yes</td>
	</tr>
	<tr class="row-5 odd">
		<td class="column-1">Limit on the number of keywords</td><td class="column-2">available runtime memory</td><td class="column-3">BOOST_VARIANT_LIMIT_TYPES</td>
	</tr>
</tbody>
</table>

<p>The keywords parsing construct can save a lot of typing over the <a title="Nabialek trick" href="http://boost-spirit.com/home/articles/qi-example/nabialek-trick/" target="_blank">Nabialek trick</a> and in many cases even performs better. It also makes retrieving the parsed data into program-usable structures much easier, as it supports attribute propagation. The main limitation of the keyword parser is the number of keywords a keyword block may contain (limited by the maximum size of the variant type, BOOST_VARIANT_LIMIT_TYPES).</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2011/04/16/the-keyword-parser/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Dispatching on Expectation Point Failures</title>
		<link>http://boost-spirit.com/home/2011/02/28/dispatching-on-expectation-point-failures/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=dispatching-on-expectation-point-failures</link>
		<comments>http://boost-spirit.com/home/2011/02/28/dispatching-on-expectation-point-failures/#comments</comments>
		<pubDate>Mon, 28 Feb 2011 14:23:06 +0000</pubDate>
		<dc:creator>Rob Stewart</dc:creator>
				<category><![CDATA[Intermediate]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[Spirit2 Release]]></category>
		<category><![CDATA[User Experience]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1364</guid>
		<description><![CDATA[When using expectation points, a parsing failure results in an exception that generically indicates the failure, but probably doesn&#8217;t explain the problem in the most meaningful way. It is possible to attach an error handler to react to the failed match in a more specialized way: That will produce a message like the following on [...]]]></description>
			<content:encoded><![CDATA[<p>When using expectation points, a parsing failure results in an exception that generically indicates the failure, but probably doesn&#8217;t explain the problem in the most meaningful way. It is possible to attach an error handler to react to the failed match in a more specialized way:</p>
<p><span id="more-1364"></span></p>
<pre class="brush: cpp; title: ; notranslate">
rule = alpha &gt; '!';
on_error&lt;fail&gt;(rule,
   std::cerr &lt;&lt; val(&quot;Expected '!' at offset &quot;) &lt;&lt; (_3 - _1)
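      // construct&lt;&gt; is boost::phoenix::construct, which builds the string lazily
      &lt;&lt; &quot; in \&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'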
      &lt;&lt; std::endl);
</pre>
<p>That will produce a message like the following on stderr:</p>
<p><code> Expected '!' at offset 7 in "Some input"</code></p>
<p>However, if there&#8217;s more than one expectation point in a rule, then the  diagnostic may be unhelpfully generic. To do otherwise, one must distinguish which  expectation point failed. While it is certainly possible to factor the  grammar into additional rules in order to have at most one expectation  point per rule, that&#8217;s not necessary and can make the grammar less readable than otherwise. Instead, the <em>what</em> parameter (<code>_4</code>) of the error handler can be used:</p>
<pre class="brush: cpp; title: ; notranslate">
rule = alpha &gt; '!';
on_error&lt;fail&gt;(rule,
   std::cerr &lt;&lt; val(&quot;Expected &quot;) &lt;&lt; _4 &lt;&lt; &quot; at offset &quot;
      &lt;&lt; (_3 - _1) &lt;&lt; &quot; in \&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'
      &lt;&lt; std::endl);
</pre>
<p>The <em>what</em> parameter describes the failure. In the case of an expectation point match failure, it is the name of the parser that failed to match. If that parser matches literal text, like <code>'!'</code> in the preceding example, <code>_4</code> will be <code>"literal-char"</code> or similar (in the form of a boost::spirit::utf8_string, which is a specialization of std::basic_string), and thus not terribly useful in a diagnostic.</p>
<p>To make the error message more helpful, and especially in rules with more than one literal parser to distinguish, create distinct, named rules:</p>
<pre class="brush: cpp; title: ; notranslate">
exclamation = lit('!');
exclamation.name(&quot;!&quot;);
rule = alpha &gt; exclamation;
on_error&lt;fail&gt;(rule,
   std::cerr &lt;&lt; val(&quot;Expected &quot;) &lt;&lt; _4 &lt;&lt; &quot; at offset &quot;
      &lt;&lt; (_3 - _1) &lt;&lt; &quot; in \&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'
      &lt;&lt; std::endl);
</pre>
<p>This will report <code>Expected ! at offset...</code> when the exclamation rule fails to match.</p>
<p>Since an expectation point failure is distinguished by the <em>what</em> parameter, it follows that the <em>what</em> parameter can be used to dispatch to different behavior in the error handler based upon which expectation point failed to match. Doing so can be as simple as passing the <em>what</em> parameter to an error handling function which can use normal C++ techniques for dispatch such as cascading if-else&#8217;s or a map lookup, using the <em>what</em> string as the key to find a function to call. However, Phoenix offers power to do that work within the context of the <code>on_error()</code> call:</p>
<pre class="brush: cpp; title: ; notranslate">
semicolon = lit(';');
semicolon.name(&quot;;&quot;);
rule = alpha &gt; semicolon &gt; alpha;
on_error&lt;fail&gt;(rule,
   let(_a = bind(&amp;boost::spirit::info::tag, _4))
   [
      if_(&quot;;&quot; == _a)
      [
         report_missing(_4, _1, _2, _3)
      ]
      .else_
      [
         if_(&quot;alpha&quot; == _a)
         [
            report_missing(&quot;second word&quot;, _1, _2, _3)
         ]
         .else_
         [
            report_error(_4, _1, _2, _3)
         ]
      ]
   ]);
</pre>
<p>For the last example to compile, a number of include and using directives are necessary beyond the basics you are probably accustomed to seeing:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;boost/spirit/home/phoenix/bind/bind_member_variable.hpp&gt;
#include &lt;boost/spirit/home/phoenix/scope/let.hpp&gt;
#include &lt;boost/spirit/home/phoenix/scope/local_variable.hpp&gt;
#include &lt;boost/spirit/home/phoenix/statement/if.hpp&gt;
using namespace boost::phoenix::local_names;
</pre>
<p>It would seem, at first blush, that comparing to <code>_4</code> directly should work, but it doesn&#8217;t because <code>_4</code> is a Phoenix actor. Instead, a string type is needed to support the comparisons against the string literals for dispatching. In this example, a local Phoenix variable, <code>_a</code> is declared and assigned the result of binding <code>_4</code> to boost::spirit::info::tag, the field of the boost::spirit::info struct that contains the <em>what</em> string. Thus, <code>_a</code> is a variable local to the error handler that is bound to the boost::spirit::utf8_string that describes the error and supports comparisons. Note the use of Phoenix&#8217;s <code>let</code> construct to declare a local variable scope. (This <code>_a</code>, which is <code>boost::phoenix::local_names::_a</code>, can be ambiguous with <code>boost::spirit::qi::_a</code>, depending upon using directives and declarations.)</p>
<p>The two functions, <code>report_missing()</code> and <code>report_error()</code> are not defined here, but presumably would report on stderr or raise an exception to indicate that a parsing error occurred, and would report the error context from the input range <code>[_1,_2)</code> and would note the error location, within that range, as given by <code>_3</code>.</p>
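<p>As a rough idea of what such reporting functions might build, here is a minimal sketch of the message formatting only. The function <code>describe_failure</code> is a hypothetical helper, not something provided by Spirit, and it assumes the <em>what</em> text has already been converted to a std::string.</p>

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Hypothetical helper: builds the text a report_missing() or report_error()
// implementation might print. 'begin' and 'end' delimit the input handed to
// the error handler (_1, _2) and 'where' is the error position (_3).
template <class Iterator>
std::string describe_failure(std::string const& what,
    Iterator begin, Iterator end, Iterator where)
{
    std::ostringstream os;
    os << "Expected " << what
       << " at offset " << std::distance(begin, where)
       << " in \"" << std::string(begin, end) << '"';
    return os.str();
}
```

<p>To be callable from within <code>on_error()</code>, a function like this would still have to be made lazy, for example by wrapping a function object in <code>phx::function</code> as is done for the error handler further down.</p>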
<p>When dispatching in this manner, there can be other parsing errors besides expectation point match failures, hence the final <code>.else_</code> branch in the example error handler. For lack of a better response, the example just reports a generic error message that includes the <em>what</em> parameter's text to give some sort of explanation. A real world rule would possibly provide a more context-specific diagnostic.</p>
<p>A final caution regarding this technique: the compile time, maintenance burden, and code size increase with each additional expectation point to be handled. Using a map-based dispatch may well be better when the number of expectation points grows. However, the diagnostic text generation may get out of synchronization with the point in the grammar triggering it because the two are located in different parts of the code.</p>
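<p>A map-based dispatch of that kind could be sketched as follows. This is plain C++ with no Spirit involved. The handler signature, the handler functions and the <code>dispatcher</code> class are all hypothetical, and the sketch assumes the <em>what</em> string has already been extracted from the error handler's arguments.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical handler signature: receives the what string and returns
// the diagnostic text to report.
typedef std::string (*diagnostic_fn)(std::string const& what);

inline std::string missing_semicolon(std::string const&)
{
    return "Missing semicolon after first word";
}

inline std::string missing_second_word(std::string const&)
{
    return "Missing second word";
}

inline std::string generic_error(std::string const& what)
{
    return "Invalid syntax: expected " + what;
}

// Maps what strings to handlers, with a fallback for unknown tags.
class dispatcher
{
public:
    dispatcher()
    {
        handlers_[";"] = &missing_semicolon;
        handlers_["alpha"] = &missing_second_word;
    }

    std::string operator ()(std::string const& what) const
    {
        std::map<std::string, diagnostic_fn>::const_iterator const it
            = handlers_.find(what);
        return it != handlers_.end() ? it->second(what) : generic_error(what);
    }

private:
    std::map<std::string, diagnostic_fn> handlers_;
};
```

<p>The table of handlers lives apart from the grammar here, which is exactly the synchronization risk mentioned above.</p>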
<p>There is another way to keep the diagnostic text near the rule triggering an error, while avoiding a great deal of code within the grammar. It involves collecting the rule name and corresponding diagnostic in a structure stored in an array that is then passed to an error handler that uses the <em>what</em> parameter to select a diagnostic from the array. If that was as clear as mud, don't worry. The code should make it clear. Let's start with the rule name to diagnostic mapping which combines the structure and array within a class template:</p>
<pre class="brush: cpp; title: ; notranslate">
template &lt;size_t N&gt;
class diagnostics
{
public:
   diagnostics();

   // Adds a tag and diagnostic message pair to self.
   void
   add(char const * _tag, char const * _diagnostic);

   // Returns the diagnostic, if any, for _tag.
   char const *
   operator [](char const * _tag) const;

private:
   struct entry
   {
      char const * tag;
      char const * diagnostic;
   };

   entry  entries_[N];
   size_t size_;
};
</pre>
<p>diagnostics, as written, simply saves pointers to string literals. For more flexibility, it could store real strings (std::basic_string&lt;&gt;s, for instance), but this design is useful and simpler for exposition. To use diagnostics, one must create a grammar data member for each rule that will use it, and then populate it as needed by the rule:</p>
<pre class="brush: cpp; title: ; notranslate">
semicolon = lit(';');
semicolon.name(&quot;;&quot;);
rule = alpha &gt; semicolon &gt; alpha;
diags.add(&quot;;&quot;, &quot;Missing semicolon after first word&quot;);
diags.add(&quot;alpha&quot;, &quot;Missing second word&quot;);
on_error&lt;fail&gt;(rule,
   error_handler(ref(diags), _1, _2, _3, _4));
</pre>
<p>Notice how the first expectation point is identified by a named rule for the required semicolon, which will produce an error message or exception containing the diagnostic text <code>"Missing semicolon after first word"</code>. Similarly, if there is no word after a semicolon, then the diagnostic <code>"Missing second word"</code> will be used because the second alpha will fail to match. In each case, the expectation is that the error handler will use <code>_4</code> to indicate which rule failed to satisfy an expectation point.</p>
<p>To round out this example, here's how <code>error_handler()</code> might look:</p>
<pre class="brush: cpp; title: ; notranslate">
struct error_handler_impl
{
   template &lt;class, class, class, class, class&gt;
   struct result { typedef void type; };

   template &lt;class D, class B, class E, class W, class I&gt;
   void
   operator ()(D const &amp; _diagnostics, B _begin, E _end,
      W _where, I const &amp; _info) const
   {
      utf8_string const &amp; tag(_info.tag);
      char const * const what(tag.c_str());
      char const * diagnostic(_diagnostics[what]);
      std::string scratch;
      if (!diagnostic)
      {
         scratch.reserve(25 + tag.length());
         scratch = &quot;Invalid syntax: expected &quot;;
         scratch += tag;
         diagnostic = scratch.c_str();
      }
      raise_parsing_error(diagnostic, _begin, _end,
         _where);
   }
};
phx::function&lt;error_handler_impl&gt; error_handler;
</pre>
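<p>The function <code>raise_parsing_error()</code> is not defined in this article. One possible shape for it is sketched below. The exception class <code>parsing_error</code> and the choice to report only the offset of the failure are assumptions made for illustration.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical exception type carrying the offset of the failure.
class parsing_error : public std::runtime_error
{
public:
    parsing_error(std::string const& message, std::size_t offset)
      : std::runtime_error(message), offset_(offset) {}

    std::size_t offset() const { return offset_; }

private:
    std::size_t offset_;
};

// Hypothetical implementation of the raise_parsing_error() call used above:
// throws with the diagnostic text and the offset of 'where' within the input.
template <class Iterator>
void raise_parsing_error(char const* diagnostic,
    Iterator begin, Iterator /*end*/, Iterator where)
{
    throw parsing_error(diagnostic,
        static_cast<std::size_t>(std::distance(begin, where)));
}
```

<p>The exception copies the diagnostic text, so it does not matter that the fallback message in the handler lives in a local scratch string.</p>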
<p>You're probably wondering where the implementations of diagnostics' member functions are to be found. Here they are:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;cassert&gt;   // assert
#include &lt;cstring&gt;   // std::strcmp

template &lt;size_t N&gt;
inline
diagnostics&lt;N&gt;::diagnostics()
   : size_(0)
{
}

template &lt;size_t N&gt;
void
diagnostics&lt;N&gt;::add(char const * const _tag,
   char const * const _diagnostic)
{
   assert(size_ &lt; N);
   entry &amp; e(entries_[size_++]);
   e.tag = _tag;
   e.diagnostic = _diagnostic;
}

template &lt;size_t N&gt;
char const *
diagnostics&lt;N&gt;::operator [](char const * const _tag) const
{
   for (size_t i(0); i &lt; size_; ++i)
   {
      entry const &amp; e(entries_[i]);
      if (0 == std::strcmp(e.tag, _tag))
      {
         return e.diagnostic;
      }
   }
   return 0;
}
</pre>
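<p>As mentioned earlier, diagnostics could store real strings instead of pointers to string literals. A minimal sketch of such a variant follows. The class name <code>string_diagnostics</code> is hypothetical, and unlike the array-based version, adding the same tag twice replaces the earlier message rather than keeping the first one.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical variant of diagnostics that owns its strings and has no
// fixed capacity.
class string_diagnostics
{
public:
    // Adds a tag and diagnostic message pair; a later add() for the same
    // tag replaces the earlier message.
    void add(std::string const& tag, std::string const& diagnostic)
    {
        entries_[tag] = diagnostic;
    }

    // Returns the diagnostic for tag, or a null pointer if there is none.
    // The pointer stays valid until the entry is replaced or self is destroyed.
    char const* operator [](std::string const& tag) const
    {
        std::map<std::string, std::string>::const_iterator const it
            = entries_.find(tag);
        return it != entries_.end() ? it->second.c_str() : 0;
    }

private:
    std::map<std::string, std::string> entries_;
};
```

<p>Because the lookup still returns a <code>char const *</code>, an error handler written against the array-based version should work with this variant unchanged.</p>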
<p>It should now be apparent that there are numerous ways to dispatch error handling when using expectation points, but all revolve around decoding the <em>what</em> parameter. In the end, factor your grammar to be functional and readable and then consider which expectation point failure dispatching technique fits best without sacrificing readability or performance.</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2011/02/28/dispatching-on-expectation-point-failures/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Parsing Escaped String Input Using Spirit.Qi</title>
		<link>http://boost-spirit.com/home/2010/11/13/parsing-escaped-string-input-using-spirit-qi/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=parsing-escaped-string-input-using-spirit-qi</link>
		<comments>http://boost-spirit.com/home/2010/11/13/parsing-escaped-string-input-using-spirit-qi/#comments</comments>
		<pubDate>Sat, 13 Nov 2010 20:28:40 +0000</pubDate>
		<dc:creator>Hartmut Kaiser</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Intermediate]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1182</guid>
		<description><![CDATA[Jeroen Habraken (a.k.a. VeXocide) sent an article about parsing escaped strings using Qi, which we happily publish for everybody to read. Thanks Jeroen! Continue reading here.]]></description>
			<content:encoded><![CDATA[<p>Jeroen Habraken (a.k.a. VeXocide) sent an article about parsing escaped strings using <em>Qi</em>, which we happily publish for everybody to read. Thanks Jeroen!</p>
<p>Continue reading <a href="http://boost-spirit.com/home/articles/qi-example/parsing-escaped-string-input-using-spirit-qi/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/11/13/parsing-escaped-string-input-using-spirit-qi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>S-expressions and variant</title>
		<link>http://boost-spirit.com/home/2010/03/11/s-expressions-and-variants/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=s-expressions-and-variants</link>
		<comments>http://boost-spirit.com/home/2010/03/11/s-expressions-and-variants/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 00:24:42 +0000</pubDate>
		<dc:creator>Joel de Guzman</dc:creator>
				<category><![CDATA[Build a Compiler]]></category>
		<category><![CDATA[Qi Example]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1039</guid>
		<description><![CDATA[I have a mixed relationship with variant&#8230; I just wrote a parser for S-expressions (that will be the basis of ASTs and intermediate types in my planned &#8220;write-a-compiler&#8221; article series). The parser itself is easy, but as always, I spent more time on the underlying data structures. What are S-expressions? S-expressions, also called sexps, are [...]]]></description>
			<content:encoded><![CDATA[<p>I have a mixed relationship with variant&#8230;</p>
<p>I just wrote a parser for S-expressions (that will be the basis of ASTs and intermediate types in my planned &#8220;write-a-compiler&#8221; article series). The parser itself is easy, but as always, I spent more time on the underlying data structures. </p>
<p><span id="more-1039"></span> </p>
<p>What are S-expressions? S-expressions, also called sexps, are recursive, list-based data structures. Being recursive, they can represent hierarchical information. S-expressions are parenthesized prefix expressions, known for their use in LISP (and its sibling Scheme). Here&#8217;s a simple sexp:</p>
<pre class="brush: cpp; title: ; notranslate">
(* 2 (+ 3 4))
</pre>
<p>The sexp above corresponds to this infix expression:</p>
<pre class="brush: cpp; title: ; notranslate">
(2 * (3 + 4))
</pre>
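<p>To make the correspondence between the two notations concrete, here is a small recursive evaluator for prefix expressions. It is only an illustration written against the standard library, not the Spirit-based parser discussed in this article: it handles non-negative integers and the operators <code>+</code> and <code>*</code> only, and the names <code>skip_blanks</code> and <code>eval_sexp</code> are hypothetical.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: advance 'pos' past any whitespace.
inline void skip_blanks(std::string const& s, std::size_t& pos)
{
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// Evaluate one S-expression starting at 'pos': either a non-negative integer
// or a list of the form (op arg arg ...), where op is '+' or '*'.
inline long eval_sexp(std::string const& s, std::size_t& pos)
{
    assert(pos <= s.size());
    skip_blanks(s, pos);
    if (pos >= s.size())
        throw std::runtime_error("unexpected end of input");

    if (s[pos] != '(')
    {
        // atom: a run of digits
        if (!std::isdigit(static_cast<unsigned char>(s[pos])))
            throw std::runtime_error("expected a number");
        long value = 0;
        while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
            value = value * 10 + (s[pos++] - '0');
        return value;
    }

    ++pos;  // consume '('
    skip_blanks(s, pos);
    if (pos >= s.size() || (s[pos] != '+' && s[pos] != '*'))
        throw std::runtime_error("expected '+' or '*'");
    char const op = s[pos++];

    long result = (op == '+') ? 0 : 1;  // identity element of the operator
    for (;;)
    {
        skip_blanks(s, pos);
        if (pos >= s.size())
            throw std::runtime_error("missing ')'");
        if (s[pos] == ')')
        {
            ++pos;  // consume ')'
            return result;
        }
        long const arg = eval_sexp(s, pos);  // recursion handles nested lists
        result = (op == '+') ? result + arg : result * arg;
    }
}

// Convenience overload that starts at the beginning of the string.
inline long eval_sexp(std::string const& s)
{
    std::size_t pos = 0;
    return eval_sexp(s, pos);
}
```

<p>Calling <code>eval_sexp("(* 2 (+ 3 4))")</code> walks the nested list recursively: the inner list is evaluated first and its result becomes the second argument of the outer multiplication, just as the parentheses in the infix form dictate.</p>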
<p>S-expressions are simple and infinitely powerful beasts, as is evident in applications that use LISP as their scripting language. They can represent code and data. Some people even use S-expressions as a suitable (and terser!) replacement for XML. The in-memory data structures are very easy to use: to transform, manipulate, traverse, compile, or accumulate results from.</p>
<p>The plan is to use S-expressions as our AST representation and embed a minimal LISP/Scheme interpreter <strong>IN</strong> the compiler. This implies that along the way, we&#8217;ll be building an S-expression parser and a LISP/Scheme interpreter. How cool is that? &#8230; We&#8217;re talking about scripting the compiler with an interpreter! <img src='http://boost-spirit.com/home/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I needed a dynamic data type that can represent S-expressions. I called it utree, short for universal tree. I want it to be as simple as possible, fast, and tight in memory footprint. Boost.Variant was simply out of the question (I used it in one of my early prototypes). For one, it failed a basic requirement: a tight memory footprint. The padding and the way it aligns the &#8220;what-type&#8221; integer member are quite wasteful. It aligns its storage conservatively, using the strictest alignment of the types in the union. Thus, if you have a type in there that aligns to 8 bytes, variant requires another 8 bytes just for the type discriminator! Try it out:</p>
<pre class="brush: cpp; title: ; notranslate">
struct x { void* a; void* b; void* c; };
/***/
std::cout &lt;&lt; sizeof(x) &lt;&lt; std::endl;
std::cout &lt;&lt; sizeof(boost::variant&lt;x, int, double&gt;) &lt;&lt; std::endl;
</pre>
<p>I get 12 and 24, respectively (on a 32-bit system).</p>
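<p>The overhead is not peculiar to Boost.Variant; any discriminated union laid out as &#8220;storage plus a separate tag&#8221; pays something similar. The sketch below imitates that general layout with a plain union followed by an <code>int</code>. The type <code>naive_variant</code> and the constant <code>discriminator_cost</code> are hypothetical names, and this is not Boost.Variant's actual implementation.</p>

```cpp
#include <cstddef>

struct x { void* a; void* b; void* c; };

// Hypothetical, simplified discriminated union: storage large enough and
// aligned strictly enough for x, int and double, followed by a separate tag.
// This only imitates the general layout; it is not Boost.Variant's real code.
struct naive_variant
{
    union
    {
        x as_x;
        int as_int;
        double as_double;
    } storage;
    int which;  // the "what-type" discriminator
};

// Bytes spent on the discriminator, including the padding needed to keep
// sizeof(naive_variant) a multiple of its alignment.
std::size_t const discriminator_cost = sizeof(naive_variant) - sizeof(x);
```

<p>Printing <code>sizeof(x)</code>, <code>sizeof(naive_variant)</code> and <code>discriminator_cost</code> on your own platform shows how much of the object is tag and padding. The exact numbers depend on the pointer size and on the alignment your compiler gives to <code>double</code>.</p>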
<p>I ended up with 40 bytes in my initial prototype (using STL containers and variant) and later squeezed that down to 24 (minimum). I did away with variant in my latest version and got 16 bytes. In this case, I &#8220;stole&#8221; unused padding bits from the data to store the discriminator. With these 16 bytes, I have nil, bool, int, double, string and (doubly linked) list. The string itself steals memory when it can: it stores the characters in the union when they fit and only uses the heap when needed. So, on 32-bit systems, it can store as many as 14 bytes in situ. That&#8217;s a lot for storing simple strings like symbols and identifiers. On 64-bit systems, you can store a lot more in situ and reduce heap usage further.</p>
<p>At this point, I feel like writing my own variant type that can do such things (an intrusive variant?). Having ruled out Boost.Variant, I needed to write my own data structures (a doubly linked list). I really wanted to use Boost.Intrusive, which is quite efficient, but because I had to squeeze my own variant in there, I had to make use of unions, which require PODs!</p>
<p>Here&#8217;s the work in progress:<br />
<a href="http://boost-spirit.com/dl_more/scheme/scheme_v0.2/">http://boost-spirit.com/dl_more/scheme/scheme_v0.2/</a></p>
<p>Here&#8217;s the utree API:</p>
<pre class="brush: cpp; title: ; notranslate">

    ///////////////////////////////////////////////////////////////////////////
    // The main utree (Universal Tree) class
    // The utree is a hierarchical, dynamic type that can store:
    //  - a nil
    //  - a bool
    //  - an integer
    //  - a double
    //  - a string (textual or binary)
    //  - a (doubly linked) list of utree
    //  - a reference to a utree
    //
    // The utree has minimal memory footprint. The data structure size is
    // 16 bytes on a 32-bit platform. Being a container of itself, it can
    // represent tree structures.
    ///////////////////////////////////////////////////////////////////////////
    class utree
    {
    public:

        typedef utree value_type;
        typedef detail::list::node_iterator&lt;utree&gt; iterator;
        typedef detail::list::node_iterator&lt;utree const&gt; const_iterator;
        typedef utree&amp; reference;
        typedef utree const&amp; const_reference;
        typedef std::ptrdiff_t difference_type;
        typedef std::size_t size_type;

        struct nil {};

        utree();
        explicit utree(bool b);
        explicit utree(unsigned int i);
        explicit utree(int i);
        explicit utree(double d);
        explicit utree(char const* str);
        explicit utree(char const* str, std::size_t len);
        explicit utree(std::string const&amp; str);
        explicit utree(boost::reference_wrapper&lt;utree&gt; ref);

        utree(utree const&amp; other);
        ~utree();

        utree&amp; operator=(utree const&amp; other);
        utree&amp; operator=(bool b);
        utree&amp; operator=(unsigned int i);
        utree&amp; operator=(int i);
        utree&amp; operator=(double d);
        utree&amp; operator=(char const* s);
        utree&amp; operator=(std::string const&amp; s);
        utree&amp; operator=(boost::reference_wrapper&lt;utree&gt; ref);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree const&amp; x, F f);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree&amp; x, F f);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree const&amp; x, utree const&amp; y, F f);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree&amp; x, utree const&amp; y, F f);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree const&amp; x, utree&amp; y, F f);

        template &lt;typename F&gt;
        typename F::result_type
        static visit(utree&amp; x, utree&amp; y, F f);

        template &lt;typename T&gt;
        void push_back(T const&amp; val);

        template &lt;typename T&gt;
        void push_front(T const&amp; val);

        template &lt;typename T&gt;
        iterator insert(iterator pos, T const&amp; x);

        template &lt;typename T&gt;
        void insert(iterator pos, std::size_t, T const&amp; x);

        template &lt;typename Iter&gt;
        void insert(iterator pos, Iter first, Iter last);

        template &lt;typename Iter&gt;
        void assign(Iter first, Iter last);

        void clear();
        void pop_front();
        void pop_back();
        iterator erase(iterator pos);
        iterator erase(iterator first, iterator last);

        utree&amp; front();
        utree&amp; back();
        utree const&amp; front() const;
        utree const&amp; back() const;

        utree&amp; operator[](std::size_t i);
        utree const&amp; operator[](std::size_t i) const;

        void swap(utree&amp; other);

        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;

        bool empty() const;
        std::size_t size() const;
    };

    bool operator==(utree const&amp; a, utree const&amp; b);
    bool operator&lt;(utree const&amp; a, utree const&amp; b);
    bool operator!=(utree const&amp; a, utree const&amp; b);
    bool operator&gt;(utree const&amp; a, utree const&amp; b);
    bool operator&lt;=(utree const&amp; a, utree const&amp; b);
    bool operator&gt;=(utree const&amp; a, utree const&amp; b);
</pre>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/03/11/s-expressions-and-variants/feed/</wfw:commentRss>
		<slash:comments>31</slash:comments>
		</item>
		<item>
		<title>Tracking the Input Position While Parsing</title>
		<link>http://boost-spirit.com/home/2010/03/05/tracking-the-input-position-while-parsing/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=tracking-the-input-position-while-parsing</link>
		<comments>http://boost-spirit.com/home/2010/03/05/tracking-the-input-position-while-parsing/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 16:21:53 +0000</pubDate>
		<dc:creator>Peter Schüller</dc:creator>
				<category><![CDATA[Advanced]]></category>
		<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[MultiPass]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=1026</guid>
		<description><![CDATA[The following article is about tracking the parsing position with Spirit V2. This is useful for generating error messages which tell the user exactly where an error has occurred. We also show how to use Spirit V2 to parse from an input stream without first reading the whole stream into a std::string. Continue reading » [...]]]></description>
			<content:encoded><![CDATA[<p>The following article is about tracking the parsing position with <em>Spirit</em> V2. This is useful for generating error messages which tell the user exactly where an error has occurred. We also show how to use <em>Spirit</em> V2 to parse from an input stream without first reading the whole stream into a std::string.</p>
<p><a href="http://boost-spirit.com/home/articles/qi-example/tracking-the-input-position-while-parsing/">Continue reading »</a></p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/03/05/tracking-the-input-position-while-parsing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Parsing Skippers and Skipping Parsers</title>
		<link>http://boost-spirit.com/home/2010/02/24/parsing-skippers-and-skipping-parsers/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=parsing-skippers-and-skipping-parsers</link>
		<comments>http://boost-spirit.com/home/2010/02/24/parsing-skippers-and-skipping-parsers/#comments</comments>
		<pubDate>Wed, 24 Feb 2010 13:32:08 +0000</pubDate>
		<dc:creator>Hartmut Kaiser</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[Tip of the Day]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=989</guid>
		<description><![CDATA[Spirit has supported skipper-based parsing since its very inception, so this is definitely not something new to Spirit V2. Nevertheless, the recent discussion on the Spirit mailing list around the semantics of Qi&#8217;s lexeme[] directive shows the need for some clarification. Today I will try to answer questions like: &#8220;What does it mean to use a [...]]]></description>
			<content:encoded><![CDATA[<p><em>Spirit</em> has supported skipper-based parsing since its very inception, so this is definitely not something new to Spirit V2. Nevertheless, the recent discussion on the <a href="http://boost-spirit.com/home/feedback-and-support/">Spirit mailing list</a> around the semantics of <em>Qi&#8217;s</em> <span style="font-family: Courier New;">lexeme[]</span> directive shows the need for some clarification. Today I will try to answer questions like: &#8220;What does it mean to use a skipper while parsing?&#8221;, or &#8220;When do I want to use a skipper and when not?&#8221;.</p>
<p><span id="more-989"></span></p>
<p>While parsing a formatted data stream, it is very often desirable to ignore some parts of the input. A common example is the need to skip whitespace and comments while parsing a computer language. It is certainly possible to account explicitly for the tokens to skip (such as the whitespace or the comments) while writing the grammar, but this can get very tedious, as those tokens may appear at any point in the input.</p>
<p>For the sake of simplicity, let us assume we want to parse a simple key/value expression: <span style="font-family: Courier New;">key=value</span>, where we want to allow for any number of space characters before, in between, or after the <span style="font-family: Courier New;">key</span> or the <span style="font-family: Courier New;">value</span>. A naive grammar matching the plain key/value pair without whitespace skipping would look like (see <a href="http://boost-spirit.com/home/articles/qi-example/parsing-a-list-of-key-value-pairs-using-spirit-qi/">Parsing a List of Key-Value Pairs Using Spirit.Qi</a> for more details):</p>
<pre class="brush: cpp; title: ; notranslate">
pair  =  key &gt;&gt; '=' &gt;&gt; value;
key   =  qi::char_(&quot;a-zA-Z_&quot;) &gt;&gt; *qi::char_(&quot;a-zA-Z_0-9&quot;);
value = +qi::char_(&quot;a-zA-Z_0-9&quot;);
</pre>
<p>If we want to adapt the rule <span style="font-family: Courier New;">pair</span> explicitly so that it matches any interspersed space characters, we get:</p>
<pre class="brush: cpp; title: ; notranslate">
pair  = *space &gt;&gt; key &gt;&gt; *space &gt;&gt; '=' &gt;&gt; *space &gt;&gt; value &gt;&gt; *space;
</pre>
<p>which, while it produces the desired result, is error prone and difficult to write, understand, and maintain. If we look closer, we see that the process of skipping the whitespace tokens is easily automated. It seems to be sufficient to insert a repeated invocation of the <span style="font-family: Courier New;">space</span> parser (or, more generally, any skip parser) in between the elements of the user-defined parser expression sequences.</p>
<p>In fact, that is exactly what <em>Spirit</em> can do for you! The library invokes any supplied skip parser upon entry to the parse member function of any parser conforming to the <a href="http://www.boost.org/doc/libs/1_42_0/libs/spirit/doc/html/spirit/qi/reference/parser_concepts/primitiveparser.html"><span style="font-family: Courier New;">PrimitiveParser</span></a> concept. The skip parser has to be supplied by calling a special API function: <span style="font-family: Courier New;">phrase_parse:</span></p>
<pre class="brush: cpp; title: ; notranslate">
namespace qi = boost::spirit::qi;
typedef std::string::const_iterator iterator_type;

// key and value carry no skipper type; they are declared before pair, which refers to them
qi::rule&lt;iterator_type&gt; key = qi::char_(&quot;a-zA-Z_&quot;) &gt;&gt; *qi::char_(&quot;a-zA-Z_0-9&quot;);
qi::rule&lt;iterator_type&gt; value = +qi::char_(&quot;a-zA-Z_0-9&quot;);
qi::rule&lt;iterator_type, qi::space_type&gt; pair = key &gt;&gt; '=' &gt;&gt; value;

std::string input(&quot; key = value &quot;);
iterator_type begin = input.begin();
iterator_type end = input.end();
qi::phrase_parse(begin, end, pair, qi::space);
</pre>
<p>This code snippet illustrates several important things:</p>
<ul>
<li>The function <span style="font-family: Courier New;">qi::phrase_parse</span> is equivalent to the API function <span style="font-family: Courier New;">qi::parse</span> except for its additional parameter, the skip parser. Our example utilizes <span style="font-family: Courier New;">qi::space</span>, but it is possible to use any other, even more complex parser expression as the skipper instead.</li>
<li>All rules which we want to perform the skip parsing need to be declared with the type of the skip parser they are going to be used with. Our example specifies the type of the <span style="font-family: Courier New;">qi::space</span> parser expression, which is <span style="font-family: Courier New;">qi::space_type</span>. For more complex parser expressions you might want to use a (mini) grammar or take advantage of <span style="font-family: Courier New;">BOOST_TYPEOF</span> to let the compiler deduce the actual type.</li>
<li>All rules which should not perform skip parsing have to be declared without an additional skip parser type. These rules behave like an implicit <span style="font-family: Courier New;">lexeme[]</span> directive (for more information about <span style="font-family: Courier New;">lexeme[]</span>, see below), they inhibit the invocation of the skip parser even if they are executed as part of a rule with an associated skipper.</li>
</ul>
<p>In the example above we suppressed skipping while matching either the <span style="font-family: Courier New;">key</span> or the <span style="font-family: Courier New;">value</span>; otherwise our grammar would match any additional <span style="font-family: Courier New;">space</span> character inside the <span style="font-family: Courier New;">key</span> or <span style="font-family: Courier New;">value</span> as well. Remember, the expression <span style="font-family: Courier New;">char_</span> conforms to the <a href="http://www.boost.org/doc/libs/1_42_0/libs/spirit/doc/html/spirit/qi/reference/parser_concepts/primitiveparser.html">PrimitiveParser</a> concept, so it will execute the skip parser on each of its invocations. With skipping enabled, the skip parser would therefore be executed in between any two of the matched characters.</p>
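<p>The pre-skip behavior described above can be pictured with a hand-written matcher. The sketch below uses only the standard library and is not how <em>Spirit</em> is implemented; it merely shows primitives that run the skipper before they try to match, and a word matcher that skips once up front but never inside the word. All names (<code>pre_skip</code>, <code>match_char</code>, <code>match_word</code>, <code>parse_pair</code>) are hypothetical, and for brevity a word may start with a digit here, unlike the <span style="font-family: Courier New;">key</span> rule above.</p>

```cpp
#include <cassert>
#include <cctype>
#include <string>

typedef std::string::const_iterator iter;

// The "skip parser": consumes any run of whitespace.
inline void pre_skip(iter& first, iter last)
{
    while (first != last && std::isspace(static_cast<unsigned char>(*first)))
        ++first;
}

// Primitive: skip, then match a single literal character.
inline bool match_char(iter& first, iter last, char c)
{
    pre_skip(first, last);
    if (first == last || *first != c)
        return false;
    ++first;
    return true;
}

// Skip once, then match [a-zA-Z_0-9]+ without skipping inside the word.
inline bool match_word(iter& first, iter last, std::string& out)
{
    pre_skip(first, last);
    iter const start = first;
    while (first != last &&
           (std::isalnum(static_cast<unsigned char>(*first)) || *first == '_'))
        ++first;
    out.assign(start, first);
    return start != first;
}

// pair = word '=' word, with skipping handled by the primitives themselves.
inline bool parse_pair(std::string const& input,
                       std::string& key, std::string& value)
{
    iter first = input.begin();
    iter const last = input.end();
    if (!match_word(first, last, key) || !match_char(first, last, '=') ||
        !match_word(first, last, value))
        return false;
    pre_skip(first, last);   // trailing whitespace
    return first == last;    // require the whole input to be consumed
}
```

<p>Because the skipping lives inside the primitives, <code>parse_pair</code> reads like the naive grammar, yet it accepts input such as <code>" key = value "</code>. Had <code>match_word</code> skipped before every character instead of once, it would have accepted blanks inside a word, which is exactly what the skipper-less rules above prevent.</p>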
<p>Sometimes it is necessary to turn off skipping for only a smaller part of the grammar. For this purpose Spirit implements the <span style="font-family: Courier New;">lexeme[]</span> directive. This directive inhibits skipping during the execution of the embedded parser. For instance, parsing a quoted string of alphanumeric characters would look like this:</p>
<pre class="brush: cpp; title: ; notranslate">
string = lexeme['&quot;' &gt;&gt; *alnum &gt;&gt; '&quot;'];
</pre>
<p>Here the lexeme directive disables skipping while matching the string, which avoids &#8216;losing&#8217; characters otherwise matched by the skipper. Please note: <span style="font-family: Courier New;">lexeme[]</span> performs a pre-skip step, even if it is not a <a href="http://www.boost.org/doc/libs/1_42_0/libs/spirit/doc/html/spirit/qi/reference/parser_concepts/primitiveparser.html">PrimitiveParser</a> itself (it is essentially considered to be a logical primitive by design). If this is undesired, you can utilize the <span style="font-family: Courier New;">no_skip[]</span> directive instead:</p>
<pre class="brush: cpp; title: ; notranslate">
string = '&quot;' &gt;&gt; no_skip[*alnum] &gt;&gt; '&quot;';
</pre>
<p>This parser will match all the characters in between the quotes, even if the string starts with a character sequence matched by the applied skip parser. The <span style="font-family: Courier New;">no_skip[]</span> directive is semantically equivalent to <span style="font-family: Courier New;">lexeme[]</span> except it does not perform a pre-skip before executing the embedded parser. Note: the <span style="font-family: Courier New;">no_skip[]</span> directive has been added only recently. It will be available starting with the next release (Boost V1.43).</p>
<p>This short article would not be complete without mentioning the <span style="font-family: Courier New;">skip[]</span> directive. This directive is the counterpart to <span style="font-family: Courier New;">lexeme[]</span>. It enables skipping for the embedded parser. Without any argument it can be used inside a lexeme or no_skip directive only. In this case it just re-enables the outer skipper:</p>
<pre class="brush: cpp; title: ; notranslate">
string = lexeme['&quot;' &gt;&gt; *(alpha | skip[digit]) &gt;&gt; '&quot;'];
</pre>
<p>This (purely hypothetical) parser would enable skipping inside a string as long as it matches digits. But the skip directive can do more. It may take an additional argument that specifies a new skipper, for instance:</p>
<pre class="brush: cpp; title: ; notranslate">
skip(qi::space)[*alnum]
</pre>
<p>which will skip spaces while executing the embedded <span style="font-family: Courier New;">*alnum</span> parser. This form of the directive can be applied for two purposes. It can be used either for changing the current skip parser or to establish skipping inside a context otherwise not doing skipping at all (even if invoked with the <span style="font-family: Courier New;">qi::parse()</span> API function).</p>
<p>For more detailed information about all the mentioned directives please see the corresponding documentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/02/24/parsing-skippers-and-skipping-parsers/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
		</item>
		<item>
		<title>Parsing Arbitrary Things in Any Sequence</title>
		<link>http://boost-spirit.com/home/2010/02/17/parsing-arbitrary-things-in-any-sequence/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=parsing-arbitrary-things-in-any-sequence</link>
		<comments>http://boost-spirit.com/home/2010/02/17/parsing-arbitrary-things-in-any-sequence/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 15:45:25 +0000</pubDate>
		<dc:creator>Hartmut Kaiser</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Intermediate]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[Tip of the Day]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=977</guid>
		<description><![CDATA[Recently, there have been a couple of questions on the Spirit mailing list asking how to parse a set of things known in advance in any sequence and any combination. A simple example would be a list of key/value pairs with known keys but the keys may be ordered in any sequence. This use case [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, there have been a couple of questions on the <em><a href="http://boost-spirit.com/home/info/mailing-list/">Spirit mailing list</a></em> asking how to parse a set of things known in advance in any sequence and any combination. A simple example would be a list of key/value pairs with known keys, where the keys may be ordered in any sequence. This use case seems to be quite common. Fortunately, Spirit provides you with a predefined parser component designed for exactly that purpose: the permutation parser.</p>
<p><span id="more-977"></span></p>
<p><em>Spirit&#8217;s</em> permutation parser <span style="font-family: Courier New;">a ^ b</span> matches either <span style="font-family: Courier New;">a</span>, <span style="font-family: Courier New;">b</span>, <span style="font-family: Courier New;">a &gt;&gt; b</span>, or <span style="font-family: Courier New;">b &gt;&gt; a</span>, where <span style="font-family: Courier New;">a</span> and <span style="font-family: Courier New;">b</span> can be arbitrary parser expressions. Just like normal sequences, this operator can be utilized to combine more than two operands. For instance, the expression <span style="font-family: Courier New;">a ^ b ^ c</span> will match <span style="font-family: Courier New;">a</span> or <span style="font-family: Courier New;">b</span> or <span style="font-family: Courier New;">c</span> (or any combination thereof) in any sequence. The attribute propagation rule for the permutation parser is</p>
<pre class="brush: cpp; title: ; notranslate">
a: A, b: B --&gt; (a ^ b): tuple&lt;optional&lt;A&gt;, optional&lt;B&gt; &gt;
</pre>
<p>As usual, if one or more operands of the expression do not expose any attribute (or expose <span style="font-family: Courier New;">unused_type</span> as their attribute, which is equivalent), these operands disappear from attribute handling:</p>
<pre class="brush: cpp; title: ; notranslate">
a: A, b: Unused --&gt; (a ^ b): optional&lt;A&gt;;
</pre>
<p>The permutation parser works out of the box whenever you do not need all of the elements to be matched in the input. But what if you want strict permutation (each operand gets matched exactly once)? As is often the case, you have two options: a simple but less versatile one, and a more complex but universally applicable one. The simple solution is to parse the input and to check afterward whether all optionals in the resulting attribute have been filled. I will leave that solution as an exercise for the reader.</p>
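<p>As a starting point for that exercise, here is one possible shape of such an after-the-fact check. It uses <code>std::tuple</code> and <code>std::optional</code> (C++17) as stand-ins for the <em>Fusion</em> tuple and <code>boost::optional</code> that <em>Qi</em> would actually hand you, so treat it as an analogy rather than drop-in code; <code>all_filled</code> is a hypothetical name.</p>

```cpp
#include <cassert>
#include <optional>
#include <tuple>

// Returns true only if every optional in the tuple holds a value.
// std::tuple/std::optional stand in for the Fusion tuple of boost::optional.
template <typename... Ts>
bool all_filled(std::tuple<std::optional<Ts>...> const& t)
{
    return std::apply(
        [](auto const&... element) { return (element.has_value() && ...); },
        t);
}
```

<p>After a successful parse you would call the check on the attribute and treat a <code>false</code> result as a failed match. The drawback of this approach is that the check happens outside the grammar, so the parser itself cannot backtrack or try an alternative when an element is missing.</p>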
<p>If we assume the attribute to be a (<em>Fusion</em>) tuple of optionals, containing one optional for each of the parser components in the permutation parser we can write the following code (thanks to Carl Barron for the initial idea).</p>
<p>This code defines a <em>Phoenix</em> function (a lazy function encapsulating some custom functionality) checking whether one or more of the optionals in a given <em>Fusion</em> sequence are empty. The <em>Fusion</em> algorithm <span style="font-family: Courier New;">any</span> iterates over the given sequence of optionals, invoking <span style="font-family: Courier New;">optional_empty::operator()</span> for each of the elements. <span style="font-family: Courier New;">fusion::any</span> stops iterating on the first invocation returning <span style="font-family: Courier New;">true</span> and reports whether it found such an element. (<span style="font-family: Courier New;">fusion::find_if</span> does not fit here, because it selects elements by a compile-time predicate on their types rather than by their runtime values.) This is very similar to using the well known <span style="font-family: Courier New;">std::find_if</span> algorithm and comparing its result with the end iterator.</p>
<pre class="brush: cpp; title: ; notranslate">
namespace phoenix = boost::phoenix;
namespace fusion = boost::fusion;
namespace qi = boost::spirit::qi;

class no_empties_impl
{
    // helper function object to be invoked by fusion::any
    struct optional_empty
    {
        template &lt;typename T&gt;
        bool operator ()(T const&amp; val) const
        {
            return !val;  // return true if 'val' is empty.
        }
    };

public:
    template &lt;typename T&gt;
    struct result { typedef bool type; };

    // This operator will get called from the semantic action attached
    // to the permutation parser. The parameter refers to its overall
    // attribute: the fusion tuple of optionals.
    template &lt;typename T&gt;
    bool operator ()(T const&amp; t) const
    {
        // return false if at least one of the optionals is empty.
        return !fusion::any(t, optional_empty());
    }
};

// define the Phoenix function
phoenix::function&lt;no_empties_impl&gt; const no_empties = no_empties_impl();
</pre>
<p>The overall Phoenix function <span style="font-family: Courier New;">no_empties</span> will return <span style="font-family: Courier New;">false</span> if we found at least one non-initialized optional in the passed sequence. The following code snippet illustrates how everything fits together:</p>
<pre class="brush: cpp; title: ; notranslate">
std::string input (&quot;BCA&quot;);
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end();
qi::parse(begin, end,
    (qi::char_('A') ^ 'B' ^ 'C')[qi::_pass = no_empties(qi::_0)]);
</pre>
<p>We assign the result of the invocation of <span style="font-family: Courier New;">no_empties</span> to Qi&#8217;s predefined placeholder <span style="font-family: Courier New;">_pass</span>. If we assign <span style="font-family: Courier New;">false</span>, the parser that the semantic action is attached to will be forced to fail retroactively (even if it matched the input successfully before). As a result, the overall parser expression will succeed as long as a) the permutation parser matches its input and b) the <em>Phoenix</em> function inside the semantic action returns <span style="font-family: Courier New;">true</span>.</p>
<p>For more information about the permutation parser please consult its documentation <a title="Permutation parser documentation" href="http://www.boost.org/doc/libs/1_41_0/libs/spirit/doc/html/spirit/qi/reference/operator/permutation.html">here</a>. Overall, this example is a bit more complex than the average parser you might usually write. It utilizes three libraries: <em>Spirit</em>, <em>Phoenix</em>, and <em>Fusion</em> in a seamless manner. But for sure, once you understand the idea, it will be easier for you to come up with similar solutions. <em>Spirit</em> has been designed with <em>Phoenix</em> and <em>Fusion</em> in mind, and in fact it relies on <em>Fusion</em> heavily itself. As a result, the integration of those libraries is almost perfect.</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/02/17/parsing-arbitrary-things-in-any-sequence/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>How to Adapt Templates as a Fusion Sequence</title>
		<link>http://boost-spirit.com/home/2010/02/08/how-to-adapt-templates-as-a-fusion-sequence/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-adapt-templates-as-a-fusion-sequence</link>
		<comments>http://boost-spirit.com/home/2010/02/08/how-to-adapt-templates-as-a-fusion-sequence/#comments</comments>
		<pubDate>Mon, 08 Feb 2010 15:59:13 +0000</pubDate>
		<dc:creator>Hartmut Kaiser</dc:creator>
				<category><![CDATA[Intermediate]]></category>
		<category><![CDATA[Qi Example]]></category>
		<category><![CDATA[Tip of the Day]]></category>
		<category><![CDATA[Qi]]></category>

		<guid isPermaLink="false">http://boost-spirit.com/home/?p=971</guid>
		<description><![CDATA[Here is another question raised from time to time: &#8220;I know how to use a plain struct as an attribute for a sequence parser in Qi by adapting it with BOOST_FUSION_ADAPT_STRUCT. Unfortunately this does not work if the struct is a template. What can I do in this case?&#8221;. There have been plans for a while [...]]]></description>
			<content:encoded><![CDATA[<p>Here is another question raised from time to time: &#8220;I know how to use a plain <span style="font-family: Courier New;">struct</span> as an attribute for a sequence parser in <em>Qi</em> by adapting it with <span style="font-family: Courier New;">BOOST_FUSION_ADAPT_STRUCT</span>. Unfortunately this does not work if the <span style="font-family: Courier New;">struct</span> is a template. What can I do in this case?&#8221;.</p>
<p>There have been plans for a while to create a separate Fusion facility, <span style="font-family: Courier New;">BOOST_FUSION_ADAPT_TPL_STRUCT</span>, that allows adapting templated data types, but it is not in place yet. Today I will describe a trick you can use to adapt your templates into &#8216;proper&#8217; Fusion sequences anyway.</p>
<p><span id="more-971"></span></p>
<p>We will use the fact that a <em>Qi</em> grammar is already a template in most cases, and even if it is not, it can easily be converted into one. Further, we will use the built-in capability of rules to invoke a custom attribute transformation if the attribute type of the right hand side does not exactly match the left hand side&#8217;s attribute type.</p>
<p>Let us assume this is the data structure we want to fill while parsing:</p>
<pre class="brush: cpp; title: ; notranslate">
template &lt;typename A, typename B&gt;
struct data
{
    A a;
    B b;
};
</pre>
<p>We would like to use this template type directly as an attribute for our grammar. One way of adapting the template type to make it usable as a Fusion sequence is to define a <span style="font-family: Courier New;">fusion::vector&lt;A&amp;, B&amp;&gt;</span> and initialize it with references to the data members of our template type. If we then pass this Fusion vector as the attribute to the actual parser expression, we effectively supply our original data members as the attributes to the parsing process.</p>
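<p>The reference-binding idea can be tried out in isolation. The following sketch is an analogy only: it uses <span style="font-family: Courier New;">std::tuple&lt;A&amp;, B&amp;&gt;</span> from the standard library in place of <span style="font-family: Courier New;">fusion::vector&lt;A&amp;, B&amp;&gt;</span> so that it compiles without Boost, and the helper names <span style="font-family: Courier New;">as_reference_tuple</span> and <span style="font-family: Courier New;">fill_through_references</span> are hypothetical:</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;string&gt;
#include &lt;tuple&gt;

// Same shape as the data&lt;&gt; template used in this article.
template &lt;typename A, typename B&gt;
struct data
{
    A a;
    B b;
};

// Hypothetical helper (not part of Spirit or Fusion): build a tuple of
// references to the members, analogous to fusion::vector&lt;A&amp;, B&amp;&gt;.
template &lt;typename A, typename B&gt;
std::tuple&lt;A&amp;, B&amp;&gt; as_reference_tuple(data&lt;A, B&gt;&amp; val)
{
    return std::tuple&lt;A&amp;, B&amp;&gt;(val.a, val.b);
}

// Writing through the tuple elements modifies the original struct.
inline data&lt;int, std::string&gt; fill_through_references()
{
    data&lt;int, std::string&gt; d = { 0, &quot;&quot; };
    std::tuple&lt;int&amp;, std::string&amp;&gt; refs = as_reference_tuple(d);
    std::get&lt;0&gt;(refs) = 42;          // assigns to d.a
    std::get&lt;1&gt;(refs) = &quot;answer&quot;;    // assigns to d.b
    return d;
}
</pre>
<p>Anything written through the elements of the reference tuple ends up in the original <span style="font-family: Courier New;">data&lt;&gt;</span> instance. This is what allows a parser that only knows how to fill a sequence of two elements to populate our struct.</p>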
<pre class="brush: cpp; title: ; notranslate">
namespace qi = boost::spirit::qi;
namespace fusion = boost::fusion;

template &lt;typename Iterator, typename A, typename B&gt;
struct data_grammar : qi::grammar&lt;Iterator, data&lt;A, B&gt;()&gt;
{
    data_grammar() : data_grammar::base_type(start)
    {
        // the implicit attribute transformation 'adapts' data&lt;&gt; to
        // the Fusion vector
        start = real_start;

        // do the actual parsing of the data&lt;&gt; members
        real_start = qi::auto_ &gt;&gt; ',' &gt;&gt; qi::auto_;
    }

    qi::rule&lt;Iterator, data&lt;A, B&gt;()&gt; start;
    qi::rule&lt;Iterator, fusion::vector&lt;A&amp;, B&amp;&gt;()&gt; real_start;
};
</pre>
<p>The signature of the grammar&#8217;s start rule has to match the signature of the grammar itself. To accommodate this we introduce a second rule, <span style="font-family: Courier New;">real_start</span>, dedicated to parsing our data members. At the same time this allows us to inject the needed transformation of our <span style="font-family: Courier New;">data&lt;&gt;</span> attribute to the Fusion vector. As the attribute types of the left hand side and the right hand side do not match, the parser expression <span style="font-family: Courier New;">start = real_start</span> will invoke <em>Spirit&#8217;s</em> customization point <span style="font-family: Courier New;">transform_attribute</span>. Since the default implementation of this customization point does not handle our special data types the way we want, we have to implement our own specialization:</p>
<pre class="brush: cpp; title: ; notranslate">
namespace boost { namespace spirit { namespace traits
{
    template &lt;typename A, typename B&gt;
    struct transform_attribute&lt;data&lt;A, B&gt;, fusion::vector&lt;A&amp;, B&amp;&gt; &gt;
    {
        typedef fusion::vector&lt;A&amp;, B&amp;&gt; type;
        static type pre(data&lt;A, B&gt;&amp; val) { return type(val.a, val.b); }
        static void post(data&lt;A, B&gt;&amp;, fusion::vector&lt;A&amp;, B&amp;&gt; const&amp;) {}
        static void fail(data&lt;A, B&gt;&amp;) {}
    };
}}}
</pre>
<p>The function <span style="font-family: Courier New;">pre()</span> is called before the right hand side parser expression is invoked. It gets passed the left hand side&#8217;s attribute (the <span style="font-family: Courier New;">data&lt;&gt;</span> instance) and is required to return the attribute to be passed to the rule&#8217;s right hand side expression. The returned Fusion vector is initialized with the references to the data members of our original <span style="font-family: Courier New;">data&lt;&gt;</span> instance. The functions <span style="font-family: Courier New;">post()</span> and <span style="font-family: Courier New;">fail()</span> can be left empty in our case. For more information about this customization point please see the corresponding documentation <a href="http://www.boost.org/doc/libs/1_41_0/libs/spirit/doc/html/spirit/advanced/customize/transform.html">here</a>.</p>
<p>I added a new example to <em>Spirit</em> demonstrating this technique. Currently, it can be accessed from the Boost SVN only (see <a href="http://svn.boost.org/svn/boost/trunk/libs/spirit/example/qi/adapt_template_struct.cpp">adapt_template_struct.cpp</a>), but in the future it will be released as part of <em>Spirit</em>.</p>
<p>Just in case you were wondering: yes, this trick works equally well for <em>Karma</em> generators. The only difference is that the members of the created Fusion vector will have to be constant references instead.</p>
]]></content:encoded>
			<wfw:commentRss>http://boost-spirit.com/home/2010/02/08/how-to-adapt-templates-as-a-fusion-sequence/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

