{"id":977,"date":"2010-02-17T07:45:25","date_gmt":"2010-02-17T15:45:25","guid":{"rendered":"http:\/\/boost-spirit.com\/home\/?p=977"},"modified":"2010-02-17T13:30:16","modified_gmt":"2010-02-17T21:30:16","slug":"parsing-arbitrary-things-in-any-sequence","status":"publish","type":"post","link":"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/","title":{"rendered":"Parsing Arbitrary Things in Any Sequence"},"content":{"rendered":"<p>Recently, there have been a couple of questions on the <em><a href=\"http:\/\/boost-spirit.com\/home\/info\/mailing-list\/\">Spirit mailing list<\/a><\/em> asking how to parse as set of things known in advance in any sequence and any combination. A simple example would be a list of key\/value pairs with known keys but the keys may be ordered in any sequence. This use case seems to be quite common. Fortunately Spirit provides you with a predefined parser component designed for exactly that purpose: the permutation parser.<\/p>\n<p><!--more--><\/p>\n<p><em>Spirit&#8217;s<\/em> permutation parser <span style=\"font-family: Courier New;\">a ^ b<\/span> matches either <span style=\"font-family: Courier New;\">a<\/span>, <span style=\"font-family: Courier New;\">b<\/span>, <span style=\"font-family: Courier New;\">a &gt;&gt; b<\/span>, or <span style=\"font-family: Courier New;\">b &gt;&gt; a<\/span>, where <span style=\"font-family: Courier New;\">a<\/span> and <span style=\"font-family: Courier New;\">b<\/span> can be arbitrary parser expressions. Just like normal sequences this operator can be utilized to combine more than two operands. For instance, the expression <span style=\"font-family: Courier New;\">a ^ b ^ c<\/span> will match <span style=\"font-family: Courier New;\">a<\/span> or <span style=\"font-family: Courier New;\">b<\/span> or <span style=\"font-family: Courier New;\">c<\/span> (or an combination thereof) in any sequence. The attribute propagation rule for the permutation parser is<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\na: A, b: B --&gt; (a ^ b): tuple&lt;optional&lt;A&gt;, optional&lt;B&gt; &gt;\r\n<\/pre>\n<p>As usual, if one or more operand of the expression do not expose any attribute (expose <span style=\"font-family: Courier New;\">unused_type<\/span> as their attribute, which is equivalent), this operand disappears from attribute handling:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\na: A, b: Unused --&gt; (a ^ b): optional&lt;A&gt;;\r\n<\/pre>\n<p>The permutation parser works out of the box whenever you do not require to match all of the elements in the input. But what if you want strict permutation (operands get matched exactly once)? You have two possibilities, as often, one simple and less versatile and one more complex but universally applicable solution. The simple solution is to parse the input and to check afterward whether all optionals in the resulting attribute have been filled.\u00a0I will leave that solution\u00a0as an exercise for the reader.<\/p>\n<p>If we assume the attribute to be a (<em>Fusion<\/em>) tuple of optionals, containing one optional for each of the parser components in the permutation parser we can write the following code (thanks to Carl Barron for the initial idea).<\/p>\n<p>This code defines a <em>Phoenix<\/em> function (a lazy function encapsulating some custom functionality) checking whether one or more of the optionals in a given <em>Fusion<\/em> sequence are empty. The <em>Fusion<\/em> algorithm <span style=\"font-family: Courier New;\">find_if<\/span> iterates over the given sequence of optionals, invoking the <span style=\"font-family: Courier New;\">option_empty::operator()<\/span> for each of the elements. <span style=\"font-family: Courier New;\">fusion::find_if<\/span> stops iterating on the first invocation returning <span style=\"font-family: Courier New;\">true<\/span> and returns the iterator to the element it stopped on. This is very similar to the well known <span style=\"font-family: Courier New;\">std::find_if<\/span> algorithm.<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nnamespace phoenix = boost::phoenix;\r\nnamespace fusion = boost::fusion;\r\nnamespace qi = boost::spirit::qi;\r\n\r\nclass no_empties_impl\r\n{\r\n    \/\/ helper function object to be invoked by fusion::find_if\r\n    struct optional_empty\r\n    {\r\n        template &lt;typename T&gt;\r\n        bool operator ()(T const&amp; val) const\r\n        {\r\n            return !val;  \/\/ return true if 'val' is empty.\r\n        }\r\n    };\r\n\r\npublic:\r\n    template &lt;typename T&gt;\r\n    struct result { typedef bool type; };\r\n\r\n    \/\/ This operator will get called from the semantic action attached\r\n    \/\/ to the permutation parser. The parameter refers to its overall\r\n    \/\/ attribute: the fusion tuple of optionals.\r\n    template &lt;typename T&gt;\r\n    bool operator ()(T const&amp; t) const\r\n    {\r\n        \/\/ look for an empty optional, if any return false.\r\n        return fusion::find_if&lt;optional_empty&gt;(t) ==\r\n               fusion::end(t);\r\n    }\r\n};\r\n\r\n\/\/ define the Phoenix function\r\nphoenix::function&lt;no_empties_impl&gt; const no_empties = no_empties_impl();\r\n<\/pre>\n<p>The overall Phoenix function <span style=\"font-family: Courier New;\">no_empties<\/span> will return <span style=\"font-family: Courier New;\">false<\/span> if we found at least one non-initialized optional in the passed sequence. The following code snippet illustrates how everything fits together:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nstd::string input (&quot;BCA&quot;);\r\nstd::string::const_iterator begin = input.begin();\r\nstd::string::const_iterator end = input.end();\r\nqi::parse(begin, end,\r\n    (qi::char_('A') ^ 'B' ^ 'C')[qi::_pass = no_empties(qi::_0)]);\r\n<\/pre>\n<p>We assign the result of the invocation of <span style=\"font-family: Courier New;\">no_empties<\/span> to Qi&#8217;s predefined placeholder <span style=\"font-family: Courier New;\">_pass<\/span>. If we assign <span style=\"font-family: Courier New;\">false<\/span>, then the parser the semantic action is attached to will be forced to fail in retrospective (even if it matched the input successfully before). As a result the overall parser expression will succeed as long as a) the permutation parser matches its input and b) the <em>Phoenix<\/em> function inside the semantic action returns <span style=\"font-family: Courier New;\">true<\/span>.<\/p>\n<p>For more information about the permutation parser please consult its documentation <a title=\"Permutation parser documentation\" href=\"http:\/\/www.boost.org\/doc\/libs\/1_41_0\/libs\/spirit\/doc\/html\/spirit\/qi\/reference\/operator\/permutation.html\">here<\/a>. Overall, this example is a bit more complex than the average parser you might usually write. It utilizes three libraries: <em>Spirit<\/em>, <em>Phoenix<\/em>, and <em>Fusion<\/em> in a seamless manner. But for sure, once you understand the idea, it will be easier for you to come up with similar solutions. <em>Spirit<\/em> has been designed with <em>Phoenix<\/em> and <em>Fusion<\/em> in mind, and in fact it relies on <em>Fusion<\/em> heavily itself. As a result, the integration of those libraries is almost perfect.<\/p>\n<div class=\"sharedaddy sd-sharing-enabled\"><div class=\"robots-nocontent sd-block sd-social sd-social-icon-text sd-sharing\"><h3 class=\"sd-title\">Share this:<\/h3><div class=\"sd-content\"><ul><li><a href=\"#\" class=\"sharing-anchor sd-button share-more\"><span>Share<\/span><\/a><\/li><li class=\"share-end\"><\/li><\/ul><div class=\"sharing-hidden\"><div class=\"inner\" style=\"display: none;\"><ul><li class=\"share-facebook\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-facebook-977\" class=\"share-facebook sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=facebook\" target=\"_blank\" title=\"Click to share on Facebook\" ><span>Facebook<\/span><\/a><\/li><li class=\"share-twitter\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-twitter-977\" class=\"share-twitter sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=twitter\" target=\"_blank\" title=\"Click to share on Twitter\" ><span>Twitter<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-pinterest\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-pinterest-977\" class=\"share-pinterest sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=pinterest\" target=\"_blank\" title=\"Click to share on Pinterest\" ><span>Pinterest<\/span><\/a><\/li><li class=\"share-linkedin\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-linkedin-977\" class=\"share-linkedin sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=linkedin\" target=\"_blank\" title=\"Click to share on LinkedIn\" ><span>LinkedIn<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-reddit\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"\" class=\"share-reddit sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=reddit\" target=\"_blank\" title=\"Click to share on Reddit\" ><span>Reddit<\/span><\/a><\/li><li class=\"share-tumblr\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"\" class=\"share-tumblr sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=tumblr\" target=\"_blank\" title=\"Click to share on Tumblr\" ><span>Tumblr<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-end\"><\/li><\/ul><\/div><\/div><\/div><\/div><\/div>","protected":false},"excerpt":{"rendered":"<p>Recently, there have been a couple of questions on the Spirit mailing list asking how to parse as set of things known in advance in any sequence and any combination. A simple example would be a list of key\/value pairs with known keys but the keys may be ordered in any sequence. This use case [&hellip;]<\/p>\n<div class=\"sharedaddy sd-sharing-enabled\"><div class=\"robots-nocontent sd-block sd-social sd-social-icon-text sd-sharing\"><h3 class=\"sd-title\">Share this:<\/h3><div class=\"sd-content\"><ul><li><a href=\"#\" class=\"sharing-anchor sd-button share-more\"><span>Share<\/span><\/a><\/li><li class=\"share-end\"><\/li><\/ul><div class=\"sharing-hidden\"><div class=\"inner\" style=\"display: none;\"><ul><li class=\"share-facebook\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-facebook-977\" class=\"share-facebook sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=facebook\" target=\"_blank\" title=\"Click to share on Facebook\" ><span>Facebook<\/span><\/a><\/li><li class=\"share-twitter\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-twitter-977\" class=\"share-twitter sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=twitter\" target=\"_blank\" title=\"Click to share on Twitter\" ><span>Twitter<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-pinterest\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-pinterest-977\" class=\"share-pinterest sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=pinterest\" target=\"_blank\" title=\"Click to share on Pinterest\" ><span>Pinterest<\/span><\/a><\/li><li class=\"share-linkedin\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"sharing-linkedin-977\" class=\"share-linkedin sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=linkedin\" target=\"_blank\" title=\"Click to share on LinkedIn\" ><span>LinkedIn<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-reddit\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"\" class=\"share-reddit sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=reddit\" target=\"_blank\" title=\"Click to share on Reddit\" ><span>Reddit<\/span><\/a><\/li><li class=\"share-tumblr\"><a rel=\"nofollow noopener noreferrer\" data-shared=\"\" class=\"share-tumblr sd-button share-icon\" href=\"http:\/\/boost-spirit.com\/home\/2010\/02\/17\/parsing-arbitrary-things-in-any-sequence\/?share=tumblr\" target=\"_blank\" title=\"Click to share on Tumblr\" ><span>Tumblr<\/span><\/a><\/li><li class=\"share-end\"><\/li><li class=\"share-end\"><\/li><\/ul><\/div><\/div><\/div><\/div><\/div>","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[19,20,5,18],"tags":[8],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pIHdZ-fL","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/977"}],"collection":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/comments?post=977"}],"version-history":[{"count":7,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/977\/revisions"}],"predecessor-version":[{"id":979,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/977\/revisions\/979"}],"wp:attachment":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/media?parent=977"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/categories?post=977"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/tags?post=977"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}