Semantic Actions

Semantic actions have the form: expression[action]

Once we have defined a grammar and generated the corresponding parser, we usually need to produce some output or do some work beyond syntax analysis. The exception is when all we want is to check that an input conforms to the grammar, which is seldom the case.

What we need is a mechanism that tells the parser what work to do as it traverses the grammar while parsing an input stream. Semantic actions provide this mechanism.

Semantic actions may be attached to any expression at any level within the parser hierarchy. An action is a C/C++ function or function object that will be called if a match is found in the particular context where it is attached. The action function serves as a hook into the parser and may be used to, for example:

Generate output from the parser (ASTs, for example);
Report warnings or errors; or
Manage symbol tables.

A semantic action can be any free function or function object that is compatible with the conceptual interface:

void f(IteratorT first, IteratorT last);

where first points to the start of the matching portion of the input and last points to one past its end (a half-open range, identical to STL iterator ranges). A functor should have a member operator() with a signature compatible with the one above. The parser passes these two iterators into the function or functor whenever the attached expression matches.
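The interface above can be illustrated without any parser library. The following is a minimal sketch, not Spirit code: for_each_digit_run is a hypothetical stand-in for a parser that calls the action once per run of digits, and log_match and count_matches are illustrative names. It shows that a free function and a function object with the same call signature can be used interchangeably.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Free-function action: records each matched range in a global log.
std::vector<std::string> g_log;

void log_match(char const* first, char const* last)
{
    g_log.push_back(std::string(first, last));
}

// Function-object action: carries its own state (a reference to a counter).
struct count_matches
{
    explicit count_matches(int& n) : count(n) {}

    void operator()(char const* first, char const* last) const
    {
        assert(first != last);  // this sketch only reports non-empty matches
        ++count;
    }

    int& count;
};

// Hypothetical stand-in for a parser (an assumption, not part of Spirit):
// finds each run of digits in [first, last) and calls
// action(start_of_run, one_past_end_of_run).
template <typename ActionT>
void for_each_digit_run(char const* first, char const* last, ActionT action)
{
    while (first != last)
    {
        if (std::isdigit(static_cast<unsigned char>(*first)))
        {
            char const* run_end = first;
            while (run_end != last &&
                   std::isdigit(static_cast<unsigned char>(*run_end)))
                ++run_end;
            action(first, run_end);  // the "semantic action" call
            first = run_end;
        }
        else
        {
            ++first;
        }
    }
}
```

Calling for_each_digit_run over the text "12 abc 345" with &log_match records the two matched ranges, and calling it with count_matches(n) increments n twice. The function object is the natural choice when the action needs state that should not live in a global.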

Example:

void
my_action(char const* first, char const* last)
{
    // Copy the matched range into a string and print it.
    std::string str(first, last);
    std::cout << str << std::endl;
}

rule<> myrule = (a | b | *(c >> d))[&my_action];

The function my_action will be called whenever the expression:

 a | b | *(c >> d)

matches a portion of the input stream upon parsing. Two iterators (first and last) are passed into the function. These iterators point to the start and end, respectively, of the portion of input stream where the match is found.

Here now is our calculator enhanced with semantic actions:

void doInt(char const* str, char const* end)
{
    std::string s(str, end);
    std::cout << s << std::endl;
}
void doAdd(char const*, char const*)    { std::cout << "ADD\n";  }
void doSubt(char const*, char const*)   { std::cout << "SUBT\n"; }
void doMult(char const*, char const*)   { std::cout << "MULT\n"; }
void doDiv(char const*, char const*)    { std::cout << "DIV\n";  }

rule<> expr, expr1, expr2, group, integer;

integer  = lexeme[ (!(ch_p('+') | '-') >> +digit)[&doInt] ];
group    = '(' >> expr >> ')';
expr1    = integer | group;
expr2    = expr1 >> *(('*' >> expr1)[&doMult] | ('/' >> expr1)[&doDiv]);
expr     = expr2 >> *(('+' >> expr2)[&doAdd] | ('-' >> expr2)[&doSubt]);

Feeding in the expression "(-1 + 2) * (3 + -4)", for example, to the rule expr will produce the expected output:

-1
2
ADD
3
-4
ADD
MULT

which, by the way, is the Reverse Polish Notation (RPN) of the given expression, reminiscent of some primitive calculators and the language Forth.
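Because the actions fire in RPN order, the same hooks could evaluate the expression with an operand stack instead of printing. The following is a minimal sketch under stated assumptions: g_stack, pop_operand and replay_example are hypothetical names, and replay_example calls the actions by hand in the order listed above rather than running a real parser.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

std::vector<long> g_stack;  // operand stack shared by the actions (illustrative)

// Removes and returns the top operand.
long pop_operand()
{
    assert(!g_stack.empty());
    long v = g_stack.back();
    g_stack.pop_back();
    return v;
}

// Pushes the matched integer; strtol accepts an optional leading sign.
void doInt(char const* str, char const* end)
{
    std::string s(str, end);
    g_stack.push_back(std::strtol(s.c_str(), 0, 10));
}

// Each operator pops the right operand first, then the left one.
void doAdd(char const*, char const*)
{ long r = pop_operand(); long l = pop_operand(); g_stack.push_back(l + r); }
void doSubt(char const*, char const*)
{ long r = pop_operand(); long l = pop_operand(); g_stack.push_back(l - r); }
void doMult(char const*, char const*)
{ long r = pop_operand(); long l = pop_operand(); g_stack.push_back(l * r); }
void doDiv(char const*, char const*)   // integer division; no zero check here
{ long r = pop_operand(); long l = pop_operand(); g_stack.push_back(l / r); }

// Replays the action sequence shown for "(-1 + 2) * (3 + -4)":
// -1, 2, ADD, 3, -4, ADD, MULT.
long replay_example()
{
    g_stack.clear();
    std::string a("-1"), b("2"), c("3"), d("-4");
    doInt(a.c_str(), a.c_str() + a.size());
    doInt(b.c_str(), b.c_str() + b.size());
    doAdd(0, 0);
    doInt(c.c_str(), c.c_str() + c.size());
    doInt(d.c_str(), d.c_str() + d.size());
    doAdd(0, 0);
    doMult(0, 0);
    return pop_operand();
}
```

Replaying the sequence leaves a single value on the stack, the result of (-1 + 2) * (3 + -4). A global stack keeps the sketch short; function objects holding a reference to the stack would avoid the shared state.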