Feb 17

Recently, there have been a couple of questions on the Spirit mailing list asking how to parse as set of things known in advance in any sequence and any combination. A simple example would be a list of key/value pairs with known keys but the keys may be ordered in any sequence. This use case seems to be quite common. Fortunately Spirit provides you with a predefined parser component designed for exactly that purpose: the permutation parser.

Spirit’s permutation parser a ^ b matches either a, b, a >> b, or b >> a, where a and b can be arbitrary parser expressions. Just like normal sequences this operator can be utilized to combine more than two operands. For instance, the expression a ^ b ^ c will match a or b or c (or an combination thereof) in any sequence. The attribute propagation rule for the permutation parser is

a: A, b: B --> (a ^ b): tuple<optional<A>, optional<B> >

As usual, if one or more operand of the expression do not expose any attribute (expose unused_type as their attribute, which is equivalent), this operand disappears from attribute handling:

a: A, b: Unused --> (a ^ b): optional<A>;

The permutation parser works out of the box whenever you do not require to match all of the elements in the input. But what if you want strict permutation (operands get matched exactly once)? You have two possibilities, as often, one simple and less versatile and one more complex but universally applicable solution. The simple solution is to parse the input and to check afterward whether all optionals in the resulting attribute have been filled. I will leave that solution as an exercise for the reader.

If we assume the attribute to be a (Fusion) tuple of optionals, containing one optional for each of the parser components in the permutation parser we can write the following code (thanks to Carl Barron for the initial idea).

This code defines a Phoenix function (a lazy function encapsulating some custom functionality) checking whether one or more of the optionals in a given Fusion sequence are empty. The Fusion algorithm find_if iterates over the given sequence of optionals, invoking the option_empty::operator() for each of the elements. fusion::find_if stops iterating on the first invocation returning true and returns the iterator to the element it stopped on. This is very similar to the well known std::find_if algorithm.

namespace phoenix = boost::phoenix;
namespace fusion = boost::fusion;
namespace qi = boost::spirit::qi;

class no_empties_impl
{
    // helper function object to be invoked by fusion::find_if
    struct optional_empty
    {
        template <typename T>
        bool operator ()(T const& val) const
        {
            return !val;  // return true if 'val' is empty.
        }
    };

public:
    template <typename T>
    struct result { typedef bool type; };

    // This operator will get called from the semantic action attached
    // to the permutation parser. The parameter refers to its overall
    // attribute: the fusion tuple of optionals.
    template <typename T>
    bool operator ()(T const& t) const
    {
        // look for an empty optional, if any return false.
        return fusion::find_if<optional_empty>(t) ==
               fusion::end(t);
    }
};

// define the Phoenix function
phoenix::function<no_empties_impl> const no_empties = no_empties_impl();

The overall Phoenix function no_empties will return false if we found at least one non-initialized optional in the passed sequence. The following code snippet illustrates how everything fits together:

std::string input ("BCA");
std::string::const_iterator begin = input.begin();
std::string::const_iterator end = input.end();
qi::parse(begin, end,
    (qi::char_('A') ^ 'B' ^ 'C')[qi::_pass = no_empties(qi::_0)]);

We assign the result of the invocation of no_empties to Qi’s predefined placeholder _pass. If we assign false, then the parser the semantic action is attached to will be forced to fail in retrospective (even if it matched the input successfully before). As a result the overall parser expression will succeed as long as a) the permutation parser matches its input and b) the Phoenix function inside the semantic action returns true.

For more information about the permutation parser please consult its documentation here. Overall, this example is a bit more complex than the average parser you might usually write. It utilizes three libraries: Spirit, Phoenix, and Fusion in a seamless manner. But for sure, once you understand the idea, it will be easier for you to come up with similar solutions. Spirit has been designed with Phoenix and Fusion in mind, and in fact it relies on Fusion heavily itself. As a result, the integration of those libraries is almost perfect.

GD Star Rating
loading...
  • Reddit
  • Facebook
  • del.icio.us
  • Digg
  • Twitter
  • Print

5 Responses to “Parsing Arbitrary Things in Any Sequence”

  1. OvermindDL1 says:

    And of course, certainly there is some way to use symbols with rules stuffed in for much larger groupings too. :)

    GD Star Rating
    loading...
  2. Hicham says:

    This has been working beautifully for me since the initial posts by Carl Barron and ? I put the no_empties phoenix function as a member of my grammar.

    Perhaps there should be an equivalent to the ^ operator which does this by default, ie parsing will fail if not all of the operands are there ?

    GD Star Rating
    loading...
    • Hartmut Kaiser says:

      I’m not sure about that. Such an implementation would probably do exacty what’s described here (as we don’t have any operators available anymore). What do others think about this? Any syntax related suggestions?

      Regards Hartmut

      GD Star Rating
      loading...
  3. Olaf says:

    if I want to compile this (even with headers, main etc.) I’ve got an compile error
    no matching function for call to find_if(const boost::fusion::vector1<boost::optional >&)

    Is the shown syntax correct?

    Thanks,
    Olaf

    GD Star Rating
    loading...
  4. Olaf says:

    with fusion::all(t, optional_empty()) works

    GD Star Rating
    loading...

Leave a Reply

preload preload preload