{"id":1364,"date":"2011-02-28T06:23:06","date_gmt":"2011-02-28T14:23:06","guid":{"rendered":"http:\/\/boost-spirit.com\/home\/?p=1364"},"modified":"2011-03-02T04:19:32","modified_gmt":"2011-03-02T12:19:32","slug":"dispatching-on-expectation-point-failures","status":"publish","type":"post","link":"http:\/\/boost-spirit.com\/home\/2011\/02\/28\/dispatching-on-expectation-point-failures\/","title":{"rendered":"Dispatching on Expectation Point Failures"},"content":{"rendered":"<p>When using expectation points, a parsing failure results in an exception that generically indicates the failure, but probably doesn&#8217;t explain the problem in the most meaningful way. It is possible to attach an error handler to react to the failed match in a more specialized way:<\/p>\n<p><!--more--><\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nrule = alpha &gt; '!';\r\non_error&lt;fail&gt;(rule,\r\n   std::cerr &lt;&lt; val(&quot;Expected '!' at offset &quot;) &lt;&lt; (_3 - _1)\r\n      &lt;&lt; &quot; in \\&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'\r\n      &lt;&lt; std::endl);\r\n<\/pre>\n<p>That will produce a message like the following on stderr:<\/p>\n<p><code>Expected '!' at offset 7 in \"Some input\"<\/code><\/p>\n<p>However, if there&#8217;s more than one expectation point in a rule, then the diagnostic may be unhelpfully generic. To report something more specific, one must determine which expectation point failed. While it is certainly possible to factor the grammar into additional rules in order to have at most one expectation point per rule, that&#8217;s not necessary and can make the grammar less readable. 
Instead, the <em>what<\/em> parameter (<code>_4<\/code>) of the error handler can be used:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nrule = alpha &gt; '!';\r\non_error&lt;fail&gt;(rule,\r\n   std::cerr &lt;&lt; val(&quot;Expected &quot;) &lt;&lt; _4 &lt;&lt; &quot; at offset &quot;\r\n      &lt;&lt; (_3 - _1) &lt;&lt; &quot; in \\&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'\r\n      &lt;&lt; std::endl);\r\n<\/pre>\n<p>The <em>what<\/em> parameter describes the failure. In the case of an expectation point match failure, it is the name of the parser that failed to match or, if the parser matches literal text, like <code>'!'<\/code> in the preceding example, the <em>what<\/em> parameter will be <code>\"literal-char\"<\/code> or similar. In this case, <code>_4<\/code> will be <code>\"literal-char\"<\/code> (in the form of a boost::spirit::utf8_string, which is a specialization of std::basic_string), and thus not terribly useful in a diagnostic.<\/p>\n<p>To make the error message more helpful, especially in rules with more than one literal parser to distinguish, create distinct, named rules:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nexclamation = lit('!');\r\nexclamation.name(&quot;!&quot;);\r\nrule = alpha &gt; exclamation;\r\non_error&lt;fail&gt;(rule,\r\n   std::cerr &lt;&lt; val(&quot;Expected &quot;) &lt;&lt; _4 &lt;&lt; &quot; at offset &quot;\r\n      &lt;&lt; (_3 - _1) &lt;&lt; &quot; in \\&quot;&quot; &lt;&lt; construct&lt;std::string&gt;(_1, _2) &lt;&lt; '&quot;'\r\n      &lt;&lt; std::endl);\r\n<\/pre>\n<p>This will report <code>Expected ! at offset...<\/code> when the exclamation rule fails to match.<\/p>\n<p>Since an expectation point failure is distinguished by the <em>what<\/em> parameter, it follows that the <em>what<\/em> parameter can be used to dispatch to different behavior in the error handler based upon which expectation point failed to match. 
Doing so can be as simple as passing the <em>what<\/em> parameter to an error handling function, which can use normal C++ techniques for dispatch such as cascading if-else statements or a map lookup, using the <em>what<\/em> string as the key to find a function to call. However, Phoenix is powerful enough to do that work within the context of the <code>on_error()<\/code> call:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nsemicolon = lit(';');\r\nsemicolon.name(&quot;;&quot;);\r\nrule = alpha &gt; semicolon &gt; alpha;\r\non_error&lt;fail&gt;(rule,\r\n   let(_a = bind(&amp;boost::spirit::info::tag, _4))\r\n   [\r\n      if_(&quot;;&quot; == _a)\r\n      [\r\n         report_missing(_4, _1, _2, _3)\r\n      ]\r\n      .else_\r\n      [\r\n         if_(&quot;alpha&quot; == _a)\r\n         [\r\n            report_missing(&quot;second word&quot;, _1, _2, _3)\r\n         ]\r\n         .else_\r\n         [\r\n            report_error(_4, _1, _2, _3)\r\n         ]\r\n      ]\r\n   ]);\r\n<\/pre>\n<p>For the last example to compile, a number of include and using directives are necessary beyond the basics you are probably accustomed to seeing:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\n#include &lt;boost\/spirit\/home\/phoenix\/bind\/bind_member_variable.hpp&gt;\r\n#include &lt;boost\/spirit\/home\/phoenix\/scope\/let.hpp&gt;\r\n#include &lt;boost\/spirit\/home\/phoenix\/scope\/local_variable.hpp&gt;\r\n#include &lt;boost\/spirit\/home\/phoenix\/statement\/if.hpp&gt;\r\nusing namespace boost::phoenix::local_names;\r\n<\/pre>\n<p>It would seem, at first blush, that comparing to <code>_4<\/code> directly should work, but it doesn&#8217;t because <code>_4<\/code> is a Phoenix actor that yields a boost::spirit::info object rather than a string. Instead, a string type is needed to support the comparisons against the string literals for dispatching. 
In this example, a local Phoenix variable, <code>_a<\/code>, is declared and assigned the result of binding <code>_4<\/code> to boost::spirit::info::tag, the field of the boost::spirit::info struct that contains the <em>what<\/em> string. Thus, <code>_a<\/code> is a variable local to the error handler that is bound to the boost::spirit::utf8_string that describes the error and supports comparisons. Note the use of Phoenix&#8217;s <code>let<\/code> construct to declare a local variable scope. (This <code>_a<\/code>, which is <code>boost::phoenix::local_names::_a<\/code>, can be ambiguous with <code>boost::spirit::qi::_a<\/code>, depending upon using directives and declarations.)<\/p>\n<p>The two functions, <code>report_missing()<\/code> and <code>report_error()<\/code>, are not defined here, but presumably would report on stderr or raise an exception to indicate that a parsing error occurred. They would report the error context from the input range <code>[_1,_2)<\/code> and note the error location, within that range, as given by <code>_3<\/code>.<\/p>\n<p>When dispatching in this manner, there can be other parsing errors besides expectation point match failures, hence the final <code>.else_<\/code> branch in the example error handler. For lack of a better response, the example just reports a generic error message that includes the <em>what<\/em> parameter&#8217;s text to give some sort of explanation. A real-world rule would possibly provide a more context-specific diagnostic.<\/p>\n<p>A final caution regarding this technique: the compile time, maintenance burden, and code size increase with each additional expectation point to be handled. Using a map-based dispatch may well be better when the number of expectation points grows. 
However, the diagnostic text generation may get out of synchronization with the point in the grammar triggering it because the two are located in different parts of the code.<\/p>\n<p>There is another way to keep the diagnostic text near the rule triggering an error, while avoiding a great deal of code within the grammar. It involves collecting the rule name and corresponding diagnostic in a structure stored in an array that is then passed to an error handler that uses the <em>what<\/em> parameter to select a diagnostic from the array. If that was as clear as mud, don&#8217;t worry. The code should make it clear. Let&#8217;s start with the mapping from rule name to diagnostic, which combines the structure and array within a class template:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ntemplate &lt;size_t N&gt;\r\nclass diagnostics\r\n{\r\npublic:\r\n   diagnostics();\r\n\r\n   \/\/ Adds a tag and diagnostic message pair to self.\r\n   void\r\n   add(char const * _tag, char const * _diagnostic);\r\n\r\n   \/\/ Returns the diagnostic, if any, for _tag.\r\n   char const *\r\n   operator [](char const * _tag) const;\r\n\r\nprivate:\r\n   struct entry\r\n   {\r\n      char const * tag;\r\n      char const * diagnostic;\r\n   };\r\n\r\n   entry  entries_[N];\r\n   size_t size_;\r\n};\r\n<\/pre>\n<p>diagnostics, as written, simply saves pointers to string literals. For more flexibility, it could store real strings (std::basic_string&lt;&gt;s, for instance), but this design is useful and simpler for exposition. 
To use diagnostics, one must create a grammar data member for each rule that will use it, and then populate it as needed by the rule:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nsemicolon = lit(';');\r\nsemicolon.name(&quot;;&quot;);\r\nrule = alpha &gt; semicolon &gt; alpha;\r\ndiags.add(&quot;;&quot;, &quot;Missing semicolon after first word&quot;);\r\ndiags.add(&quot;alpha&quot;, &quot;Missing second word&quot;);\r\non_error&lt;fail&gt;(rule,\r\n   error_handler(ref(diags), _1, _2, _3, _4));\r\n<\/pre>\n<p>Notice how the first expectation point is identified by a named rule for the required semicolon, which will produce an error message or exception containing the diagnostic text <code>\"Missing semicolon after first word\"<\/code>. Similarly, if there is no word after a semicolon, then the diagnostic <code>\"Missing second word\"<\/code> will be used because the second alpha will fail to match. In each case, the expectation is that the error handler will use <code>_4<\/code> to determine which rule failed to satisfy an expectation point.<\/p>\n<p>To round out this example, here&#8217;s how <code>error_handler()<\/code> might look:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\nstruct error_handler_impl\r\n{\r\n   template &lt;class, class, class, class, class&gt;\r\n   struct result { typedef void type; };\r\n\r\n   template &lt;class D, class B, class E, class W, class I&gt;\r\n   void\r\n   operator ()(D const &amp; _diagnostics, B _begin, E _end,\r\n      W _where, I const &amp; _info) const\r\n   {\r\n      utf8_string const &amp; tag(_info.tag);\r\n      char const * const what(tag.c_str());\r\n      char const * diagnostic(_diagnostics[what]);\r\n      std::string scratch;\r\n      if (!diagnostic)\r\n      {\r\n         scratch.reserve(25 + tag.length());\r\n         scratch = &quot;Invalid syntax: expected &quot;;\r\n         scratch += tag;\r\n         diagnostic = scratch.c_str();\r\n      }\r\n      raise_parsing_error(diagnostic, _begin, _end,\r\n         _where);\r\n   }\r\n};\r\nphx::function&lt;error_handler_impl&gt; error_handler;\r\n<\/pre>\n<p>You&#8217;re probably wondering where the implementations of diagnostics&#8217; member functions are to be found. Here they are:<\/p>\n<pre class=\"brush: cpp; title: ; notranslate\" title=\"\">\r\ntemplate &lt;size_t N&gt;\r\ninline\r\ndiagnostics&lt;N&gt;::diagnostics()\r\n   : size_(0)\r\n{\r\n}\r\n\r\ntemplate &lt;size_t N&gt;\r\nvoid\r\ndiagnostics&lt;N&gt;::add(char const * const _tag,\r\n   char const * const _diagnostic)\r\n{\r\n   assert(size_ &lt; N);\r\n   entry &amp; e(entries_[size_++]);\r\n   e.tag = _tag;\r\n   e.diagnostic = _diagnostic;\r\n}\r\n\r\ntemplate &lt;size_t N&gt;\r\nchar const *\r\ndiagnostics&lt;N&gt;::operator [](char const * const _tag) const\r\n{\r\n   for (size_t i(0); i &lt; size_; ++i)\r\n   {\r\n      entry const &amp; e(entries_[i]);\r\n      if (0 == std::strcmp(e.tag, _tag))\r\n      {\r\n         return e.diagnostic;\r\n      }\r\n   }\r\n   return 0;\r\n}\r\n<\/pre>\n<p>It should now be apparent that there are numerous ways to dispatch error handling when using expectation points, but all revolve around decoding the <em>what<\/em> parameter. 
In the end, factor your grammar to be functional and readable and then consider which expectation point failure dispatching technique fits best without sacrificing readability or performance.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When using expectation points, a parsing failure results in an exception that generically indicates the failure, but probably doesn&#8217;t explain the problem in the most meaningful way. 
It is possible to attach an error handler to react to the failed match in a more specialized way:<\/p>\n","protected":false},"author":717,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"yes","spay_email":"","jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true},"categories":[20,5,3,17],"tags":[],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pIHdZ-m0","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1364"}],"collection":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/users\/717"}],"replies":[{"embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/comments?post=1364"}],"version-history":[{"count":20,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1364\/revisions"}],"predecessor-version":[{"id":1376,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/posts\/1364\/revisions\/1376"}],"wp:attachment":[{"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/media?parent=1364"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/categories?post=1364"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/boost-spirit.com\/home\/wp-json\/wp\/v2\/tags?post=1364"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}