{"id":1416,"date":"2011-04-16T10:27:14","date_gmt":"2011-04-16T17:27:14","guid":{"rendered":"http:\/\/boost-spirit.com\/home\/?p=1416"},"modified":"2011-04-16T14:27:03","modified_gmt":"2011-04-16T21:27:03","slug":"the-keyword-parser","status":"publish","type":"post","link":"http:\/\/boost-spirit.com\/home\/2011\/04\/16\/the-keyword-parser\/","title":{"rendered":"The Keyword parser"},"content":{"rendered":"<p>The keyword parser construct has recently been added to <a href=\"https:\/\/svn.boost.org\/svn\/boost\/trunk\/libs\/spirit\/repository\/doc\/html\/index.html\" target=\"_blank\">spirit&#8217;s repository<\/a> (available in 1.47 or from svn) . Here&#8217;s a small introduction to help you get started using the keyword parsers.<\/p>\n<p>Those of you familiar with the <a title=\"Nabialek trick\" href=\"http:\/\/boost-spirit.com\/home\/articles\/qi-example\/nabialek-trick\/\" target=\"_blank\">Nabialek trick <\/a>will recognize it&#8217;s working under the hood. What you can achieve with the keywords parser can also be achieved with the Nabialek trick but not always as elegantly or as efficiently.<\/p>\n<p><!--more--><\/p>\n<p>The two examples presented below are included in the spirit repository and can be found in the folder :<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">libs\/spirit\/repository\/example\/qi<\/pre>\n<h4>Data members marked by keywords (<a href=\"https:\/\/svn.boost.org\/svn\/boost\/trunk\/libs\/spirit\/repository\/example\/qi\/options.cpp\" target=\"_blank\">options.cpp<\/a>)<\/h4>\n<p>For this small introduction we&#8217;ll consider parsing a program command line.<\/p>\n<p>Options are commonly passed to applications delimited by option keywords :<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">\r\n\r\nmySuperCompiler --include includePath --define newSymbol=10 --output output.txt --define newSymbol2=20 --source mySourceFile\r\n\r\n<\/pre>\n<p>The order in which the options are specified doesn&#8217;t 
matter at all. The task of the parser we are going to write is to extract the individual options into some internal data structure we will use to control the program.<\/p>\n<p>Here are the structures we could use to hold the options passed to our command line:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\/\/ A basic preprocessor symbol\r\ntypedef std::pair&lt;std::string, int32_t&gt; preprocessor_symbol;\r\n\r\nstruct program_options {\r\n   \/\/ symbol container type definition\r\n   typedef std::vector&lt; preprocessor_symbol &gt; preprocessor_symbols_container;\r\n   \/\/ include paths\r\n   std::vector&lt;std::string&gt; includes;\r\n   \/\/ preprocessor symbols\r\n   preprocessor_symbols_container preprocessor_symbols;\r\n   \/\/ output file name\r\n   boost::optional&lt;std::string&gt; output_filename;\r\n   \/\/ input file name\r\n   std::string source_filename;\r\n};\r\n\r\n<\/pre>\n<p>Of course, the structures are adapted to be compatible with Fusion so that the parsed data can be pulled into them easily.<\/p>\n<p>Now let&#8217;s define our options rule:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\r\nrule&lt;const char *, program_options(), space_type&gt; kwd_rule;\r\n\r\nkwd_rule %= kwd(&quot;--include&quot;)[\r\n                parse_string\r\n            ]\r\n          \/ kwd(&quot;--define&quot;) [\r\n                parse_string\r\n                &gt;&gt; (\r\n                    (lit('=') &gt; int_) | attr(1)\r\n                   )\r\n            ]\r\n          \/ kwd(&quot;--output&quot;,0,1)[\r\n                parse_string\r\n            ]\r\n          \/ kwd(&quot;--source&quot;,1)[\r\n                parse_string\r\n            ]\r\n          ;\r\n<\/pre>\n<p>The first thing to notice here is that we used the %= operator. 
This means that the parsing construct we just wrote has an attribute type compatible with the attribute type of our adapted structure!<\/p>\n<p>This is one spot where the keyword parsing construct surpasses the Nabialek trick. The Nabialek trick just can&#8217;t do that.<\/p>\n<p>On the next lines we define our keyword parsing constructs. Writing<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">kwd(&quot;--include&quot;)[ parse_string ] <\/pre>\n<p>is equivalent to writing:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">lit(&quot;--include&quot;) &gt; parse_string<\/pre>\n<p>The word &#8220;--include&#8221; must be followed by a string.<\/p>\n<p>Several <a title=\"kwd directive\" href=\"http:\/\/svn.boost.org\/svn\/boost\/trunk\/libs\/spirit\/repository\/doc\/html\/spirit_repository\/qi_components\/directives\/kwd.html\">kwd directives<\/a> can be combined by using the <a title=\"Keyword list operator\" href=\"http:\/\/svn.boost.org\/svn\/boost\/trunk\/libs\/spirit\/repository\/doc\/html\/spirit_repository\/qi_components\/operators\/keyword_list.html\">\/ operator<\/a>. The kwd directive and the \/ operator work tightly together to achieve the goal of attribute compatibility while using the Nabialek trick.<\/p>\n<p>One last thing to notice is the occurrence constraints which can be associated with a kwd directive. They work like those of the repeat directive and let you add validation checks inside the keyword parsing loop.<\/p>\n<p>Writing<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">kwd(&quot;--output&quot;,0,1)[ parse_string ] <\/pre>\n<p>means that the keyword &#8220;--output&#8221; may occur at most once. If it occurs more than once the parser will fail.<\/p>\n<p>Writing<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">kwd(&quot;--source&quot;,1)[ parse_string ] <\/pre>\n<p>means that the keyword &#8220;--source&#8221; must occur once and only once. 
This works just like the repeat directive.<\/p>\n<p>Using occurrence constraints costs little at runtime and makes it easy to enforce constraints which would otherwise be much more difficult to formulate.<\/p>\n<p>The kwd directive also exists in a case-insensitive variant: ikwd. You can combine kwd and ikwd freely inside the same keyword block at the cost of a small runtime overhead.<\/p>\n<h4>Derived structures (<a href=\"https:\/\/svn.boost.org\/svn\/boost\/trunk\/libs\/spirit\/repository\/example\/qi\/derived.cpp\" target=\"_blank\">derived.cpp<\/a>)<\/h4>\n<p>A recent post on the mailing list gave me the idea to provide an example of how the keyword parser can be used to produce different derived structures depending on keywords placed in the input.<\/p>\n<p>Here&#8217;s the problem as described by MM:<\/p>\n<p>&#8220;I have a case where I have a prefix string that will distinguish what will follow it.<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\">prefix string - struct members<\/pre>\n<p>this is what is read from the input stream.\u00a0I have a base struct and 5 derived D1..D5, each derived has a different prefix as a static const std::string member.\u00a0Parsing the prefix string tells me which struct D1..D5 I should parse after.\u00a0All these derived structs are fusion adapted. 
There is a rule for each of the derived.&#8221;<\/p>\n<p>To keep the example simple here are the classes we could consider:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\r\nstruct base_type {\r\n    base_type(const std::string &amp;name) : name(name)  {}\r\n\r\n    std::string name;\r\n    virtual std::ostream &amp;output(std::ostream &amp;os) const {\r\n        os&lt;&lt;&quot;Base : &quot;&lt;&lt;name;        return os;\r\n    }\r\n};\r\n\r\nstruct derived1 : public base_type {\r\n    derived1(const std::string &amp;name, unsigned int data1) :\r\n        base_type(name)\r\n      , data1(data1)  {}\r\n\r\n    unsigned int data1;\r\n    virtual std::ostream &amp;output(std::ostream &amp;os) const {\r\n        base_type::output(os);\r\n        os&lt;&lt;&quot;, &quot;&lt;&lt;data1;\r\n        return os;\r\n    }\r\n};\r\n\r\nstruct derived2 : public base_type {\r\n    derived2(const std::string &amp;name, unsigned int data2) :\r\n        base_type(name), data2(data2)  {}\r\n\r\n    unsigned int data2;\r\n    virtual std::ostream &amp;output(std::ostream &amp;os) const    {\r\n        base_type::output(os);\r\n        os&lt;&lt;&quot;, &quot;&lt;&lt;data2;\r\n        return os;\r\n    }\r\n};\r\n\r\nstruct derived3 : public derived2 {\r\n    derived3(const std::string &amp;name, unsigned int data2, double data3) :\r\n      derived2(name,data2)\r\n    , data3(data3)\r\n    {}\r\n\r\n    double data3;\r\n    virtual std::ostream &amp;output(std::ostream &amp;os) const    {\r\n        derived2::output(os);\r\n        os&lt;&lt;&quot;, &quot;&lt;&lt;data3;\r\n        return os;\r\n    }\r\n};\r\n\r\n<\/pre>\n<p>Our parse result must be a vector of pointers to our base class:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">std::vector&lt;base_type*&gt;<\/pre>\n<p>To get that done, we&#8217;ll use semantic actions inside the kwd directive:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n\r\nkwd_rule = 
kwd(&quot;derived1&quot;)[\r\n              ('=' &gt; parse_string &gt; int_ )\r\n              [phx::push_back(_val,phx::new_&lt;derived1&gt;(_1,_2))]\r\n           ]\r\n         \/ kwd(&quot;derived2&quot;)[\r\n              ('=' &gt; parse_string &gt; int_ )\r\n              [phx::push_back(_val,phx::new_&lt;derived2&gt;(_1,_2))]\r\n           ]\r\n         \/ kwd(&quot;derived3&quot;)[\r\n              ('=' &gt; parse_string &gt; int_ &gt; double_)\r\n              [phx::push_back(_val,phx::new_&lt;derived3&gt;(_1,_2,_3))]\r\n           ]\r\n           ;\r\n\r\n<\/pre>\n<p>This rule will construct new objects of the derived classes and append them to our result vector during parsing. The input parsed by this construct is of the form:<\/p>\n<pre class=\"brush: plain; title: ; notranslate\" title=\"\"> derived2 = &quot;object1&quot; 10 derived3= &quot;object2&quot; 40 20.0 <\/pre>\n<h4>Keywords vs Nabialek trick<\/h4>\n<p>Here&#8217;s a small table to compare the features of the keyword parsing constructs and the Nabialek trick to help you decide which solution better suits your needs.<\/p>\n\n<table id=\"wp-table-reloaded-id-1-no-1\" class=\"wp-table-reloaded wp-table-reloaded-id-1\">\n<thead>\n\t<tr class=\"row-1 odd\">\n\t\t<th class=\"column-1\"><\/th><th class=\"column-2\">Nabialek trick<\/th><th class=\"column-3\">Keywords parser<\/th>\n\t<\/tr>\n<\/thead>\n<tbody>\n\t<tr class=\"row-2 even\">\n\t\t<td class=\"column-1\">Attribute propagation<\/td><td class=\"column-2\">no<\/td><td class=\"column-3\">yes<\/td>\n\t<\/tr>\n\t<tr class=\"row-3 odd\">\n\t\t<td class=\"column-1\">Runtime modification of the keyword set<\/td><td class=\"column-2\">yes<\/td><td class=\"column-3\">no<\/td>\n\t<\/tr>\n\t<tr class=\"row-4 even\">\n\t\t<td class=\"column-1\">Occurrence constraints<\/td><td class=\"column-2\">not easily implemented<\/td><td class=\"column-3\">yes<\/td>\n\t<\/tr>\n\t<tr class=\"row-5 odd\">\n\t\t<td class=\"column-1\">Limit on the number of keywords<\/td><td 
class=\"column-2\">available runtime memory<\/td><td class=\"column-3\">BOOST_VARIANT_LIMIT_TYPES<\/td>\n\t<\/tr>\n<\/tbody>\n<\/table>\n\n<p>The keywords parsing construct can save a lot of typing over the <a title=\"Nabialek trick\" href=\"http:\/\/boost-spirit.com\/home\/articles\/qi-example\/nabialek-trick\/\" target=\"_blank\">Nabialek trick<\/a> and has in many cases even better performance. It also makes retrieving the parsed data into the program usable structures much easier as it supports attribute propagation. The main limitation of the keyword parser is the number of keywords a keyword block may contain ( limited by the maximum size of the variant type BOOST_VARIANT_LIMIT_TYPES).<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The keyword parser construct has recently been added to spirit&#8217;s repository (available in 1.47 or from svn) . Here&#8217;s a small introduction to help you get started using the keyword parsers. Those of you familiar with the Nabialek trick will recognize it&#8217;s working under the hood. 
What you can achieve with the keywords parser can [&hellip;]<\/p>\n","protected":false},"author":747,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"yes","spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[10,5],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pIHdZ-mQ","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1416"}],"collection":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/users\/747"}],"replies":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/comments?post=1416"}],"version-history":[{"count":63,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1416\/revisions"}],"predecessor-version":[{"id":1482,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1416\/revisions\/1482"}],"wp:attachment":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/media?parent=1416"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/categories?post=1416"},{"taxonomy":"post_
tag","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/tags?post=1416"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}