Mar ’10 03

The concept of Spirit’s semantic actions seems easy enough to understand, as most people new to the library prefer using them over applying the built-in attribute propagation rules. That is not surprising. The idea of attaching a function to any point of a grammar, to be called whenever the corresponding parser matches, is straightforward to grasp. Earlier versions of Spirit required a semantic action to conform to a very specific interface. Today’s semantic actions are more flexible and more powerful. Recently, a couple of people asked questions about them, so I decided to dedicate this Tip of the Day to the specifics and the usage model of semantic actions in Spirit Qi.

All three of Spirit’s sub-libraries – Qi, Karma, and Lex – support semantic actions. They differ in each case and have their own specifics. Today I will highlight semantic actions in Qi; I will dedicate later Tips of the Day to semantic actions in Karma and Lex.

Semantic actions are functions or function objects attached to some specific part of a grammar. In Qi they are invoked after the corresponding parser successfully recognizes a portion of the input. Here the semantic action receives the attribute value of the matching parser.

Semantic Actions – a General View

A semantic action f is attached to a Qi parser p by simply writing:

p[f]

The function (or function object) f has to expose a certain interface allowing Spirit to pass the proper argument types. In the simplest case this can be a global function taking no arguments at all.

void func()
{
    std::cout << "Matched an integer!\n";
}

std::string input("1234");
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end();
qi::parse(begin, end, qi::int_[func]);     // this will call func

Most of the time this is not sufficient as a semantic action is expected to receive the matched attribute value. This is possible by writing:

void func(int attribute)
{
    std::cout << "Matched integer: " << attribute << "\n";
}

The type of the expected parameter (in this case the int) depends on the parser the semantic action is attached to. The attribute type exposed by the parser has to be convertible to the argument type.

There are actually two more arguments being passed: the parser context and a reference to a boolean ‘hit’ parameter. The parser context is meaningful only if the semantic action is attached somewhere on the right hand side of a rule. We will see more information about this shortly. The boolean value can be set to false inside the semantic action, invalidating the match in retrospect and making the parser fail. Qi allows us to bind a nullary or a single-argument function, as above; in that case the other arguments are simply ignored.

It is feasible to bind any function object (such as those generated by Boost.Bind or Boost.Lambda) as a semantic action. Even though the documentation shows a couple of examples, I would not recommend using those libraries in this context. For me the preferred method of writing semantic actions is to employ Boost.Phoenix – a companion library bundled with Spirit. It is like Boost.Lambda on steroids, with special custom features that make it easy to integrate semantic actions with Spirit. If your requirements go beyond simple parsing, I suggest that you use this library. All the following examples in this article will use Boost.Phoenix for semantic actions. But whatever method you use, please let me highlight the following:

All three libraries (Boost.Bind, Boost.Lambda, and Boost.Phoenix) allow you to utilize special placeholders to control parameter placement (_1, _2, etc.). Unfortunately, each of those libraries has its own implementation of the placeholders, all in different namespaces. You have to make sure not to mix placeholders with a library they don’t belong to, and not to use different libraries while writing a single semantic action.

Generally, for Boost.Bind, use ::_1, ::_2, etc. (yes, these placeholders are defined in the global namespace).

For Boost.Lambda use the placeholders defined in the namespace boost::lambda.

For semantic actions written using Boost.Phoenix use the placeholders defined in the namespace boost::spirit. Please note that, for your convenience, all existing placeholders are also available from the namespace boost::spirit::qi.

The current version of Spirit (V2.2) does not yet support binding a native C++0x lambda function as a semantic action, but this is something we are currently working on. You can expect this to be possible in the near future.

Writing Phoenix based Semantic Actions

Writing a semantic action with Phoenix is beneficial as Spirit ‘knows’ about Phoenix. If you write your actions with the help of Phoenix you can utilize the special placeholders Spirit provides. Those placeholders refer to elements in the context of the current parser execution, such as attributes, local variables, and inherited attributes of rules. None of the other means of writing semantic actions (using Bind, Lambda, or hand-written function objects) gives you direct access to those elements. The following table lists all available placeholders exposed by Spirit (as mentioned earlier, all are defined in the namespace boost::spirit::qi). Again, please note that these are only available inside a semantic action, and only if the semantic action is written utilizing Phoenix.

Placeholder            Description
_1, _2, ... , _N       Nth attribute of the parser p
_pass                  Assign false to _pass to force a parser failure.
_val                   The enclosing rule’s synthesized attribute.
_r1, _r2, ... , _rN    The enclosing rule’s Nth inherited attribute.
_a, _b, ... , _j       The enclosing rule’s local variables (_a refers to the first).

Obviously, the placeholders listed in the last three rows of the table are meaningful only if used in a rule definition. As an example, let us rewrite the semantic action from above with Phoenix:

std::string input("1234");
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end();
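// assumes: namespace phoenix = boost::phoenix;
qi::parse(begin, end,
    qi::int_
    [
        // phoenix::val makes the whole expression lazy, so nothing is
        // printed until the parser actually matches
        std::cout << phoenix::val("Matched integer: ") << qi::_1 << "\n"
    ]
);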

One problem with earlier versions of Spirit (i.e. Spirit.Classic) was that while parsing sequences of things it was difficult to avoid calling a semantic action prematurely. For instance, in a sequence of two integer parsers (int_[f1] >> ',' >> int_[f2]) the function f1 got called immediately after the first integer matched, even if the second integer parser failed later on. In the current version of Spirit this is no longer an issue, as it is possible to attach a semantic action to the whole sequence while still referring to the individual attributes of the different sequence elements:

std::string input("1234,2345");
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end();
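// assumes: namespace phoenix = boost::phoenix;
qi::parse(begin, end,
    (qi::int_ >> ',' >> qi::int_)
    [
        // called only after the whole sequence has matched
        std::cout << phoenix::val("Matched integers: ")
                  << qi::_1 << " and " << qi::_2 << "\n"
    ]
);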

Here, qi::_1 refers to the attribute matched by the first integer parser, and qi::_2 to the second one.

Initially I was planning to additionally describe the internal interface of a semantic action. Utilizing this interface allows you to write your own function objects and still get access to the elements of the parser context mentioned above (attributes, the rule’s local variables and inherited attributes, etc.). But this post has already become longer than anticipated, which is why I defer this discussion to a second Tip of the Day. Stay tuned!

10 Responses to “The Anatomy of Semantic Actions in Qi”

  1. Rob says:

    What I find always lacking in the documentation and the various posts I’ve found on semantic actions is a clear delineation of what always seems to be my need: calling a member function. For example, if I need to call a member function on an object referenced in my grammar with qi::_1, how do I do that? Ideally, it would be as simple as:

       [foo.f(qi::_1)]
    

    but that won’t work, of course. There are likely several approaches to solve the problem, but I’d sure like to know what the best is. If the underlying data members were accessible as a struct, I could adapt the struct to be a Fusion sequence and Phoenix would apply, but what do I do when that’s not the case?

    • Rob says:

      I should mention that I see the Boost.Bind example in the documentation, but that seems awfully heavy; is that really the best recourse?

    • Hartmut Kaiser says:

      Calling a member function is possible by binding:

      p[phoenix::bind(&foo::f, a_f, qi::_1)]
      

      this is because C++ doesn’t allow overloading the ‘dot’ operator.

      You asked about other ways to adapt a class with non-public members. Have you seen the new Fusion adapt interface for abstract data types (BOOST_FUSION_ADAPT_ADT)?

  2. Rob says:

    I’ve tried twice now, to post a somewhat lengthy comment, but they disappeared with no explanation. If you can find them, just keep the second one.

    • Rob says:

      I had tried the post as a reply to yours, but just tried again with a top level reply. It, too, failed. Is there a length limit above which replies are just silently discarded? Is there a spam filter that is swallowing my posts based upon the length or some other criteria? An error message would be preferable to the silence that currently results.

  3. Rob says:

    This post, like everything else I’ve found on semantic actions, defers to something later to describe or just bypasses the context argument passed to a semantic action. Is there any documentation on that argument? I found that I can access “locals” as a tuple of some unknown type, that I could access using fusion::at_c. What else is in it and where can I find it documented?

  4. Latifi says:

    Is there any placeholder for all the input matched, not part by part?
    for example in this grammar:

    start = qi::string("Hello") >> qi::char_('-') >> qi::string("Bye");

    give us “Hello-Bye”.
