Closures

Closures provide an environment, a stack frame, in which local variables reside. Most importantly, the closure variables are accessible from the EBNF grammar specification and can be used to pass parser information upstream or downstream, from the topmost rule down to the terminals, in a top-down recursive descent. Closures give a parser the ability to have inherited and synthesized attributes, more or less analogous to function arguments and return values, respectively.

When a parser is enclosed in a closure, the closure's local variables are created just before the parse function is entered and destroyed just after it exits. These local variables are true local variables that exist on the hardware stack.

Nested Functions

To fully understand the importance of closures, it is best to look at a language such as Pascal, which allows nested functions. Since we are dealing with C++, let us assume for the moment that C++ allows nested functions. Consider the following pseudo-C++ code:

    void a()
    {
        int va;
        void b()
        {
            int vb;
            void c()
            {
                int vc;
            }

            c();
        }

        b();
    }

We have three functions a, b and c where c is nested in b and b is nested in a. We also have three variables va, vb and vc. The lifetime of each of these local variables starts when the function where it is declared is entered and ends when the function exits. The scope of a local variable spans all nested functions inside the enclosing function where the variable is declared.

Going downstream from function a to function c: when function a is entered, the variable va is created on the stack. When function b is entered (called by a), va is in scope and visible in b, and a fresh variable, vb, is created on the stack. When function c is entered, both va and vb are in scope and visible, and a fresh local variable, vc, is created.

Going upstream, vc is not and cannot be visible outside the function c; its lifetime ends when c exits. The same is true of vb: it is accessible in function c but not in function a.
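The scoping rules above can be approximated in real C++ (C++11 or later) with lambdas that capture by reference. This is only a sketch of the nested-function analogy, not anything Spirit does internally; the name run_nested and the values 1, 2 and 3 are made up for illustration.

```cpp
#include <cassert>
#include <vector>

// Emulates the nested functions a, b and c with lambdas.
// Returns the values that c can see, in the order {va, vb, vc}.
std::vector<int> run_nested()
{
    std::vector<int> seen;

    int va = 1;                     // created when "a" is entered
    auto b = [&]() {
        int vb = 2;                 // created when b is entered; va is in scope
        auto c = [&]() {
            int vc = 3;             // created when c is entered; va and vb are in scope
            seen = {va, vb, vc};
        };
        c();
        // vc no longer exists here
    };
    b();
    // vb no longer exists here

    return seen;
}
```

Calling run_nested() yields the three values in declaration order, which shows that c sees all enclosing variables while nothing declared in c or b survives their exit.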

Nested Mutually Recursive Rules

Now consider that a, b and c are rules:

    a = b >> *(('+' >> b) | ('-' >> b));
    b = c >> *(('*' >> c) | ('/' >> c));
    c = int_p | '(' >> a >> ')' | ('-' >> c) | ('+' >> c);

We can visualize a, b and c as mutually recursive functions where a calls b, b calls c and c recursively calls a. Now imagine that a, b and c each have a local variable named value that can be referred to in our grammar by explicit qualification:

    a.value // refer to a's value local variable
    b.value // refer to b's value local variable
    c.value // refer to c's value local variable

As above, when a is entered, a local variable value is created on the stack. This variable can be referred to by both b and c. Again, when b is called by a, b creates its own local variable value. This variable is accessible by c but not by a.

Here is where the analogy with nested functions ends: when c is called, a fresh variable value is created which, as usual, lasts for the whole lifetime of c. Note, however, that c may call a recursively. When this happens, a may now refer to the local variable of c.
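To make the "one fresh value per invocation" idea concrete, here is a hand-written recursive-descent sketch of the three rules, using only the standard library. It is an assumption-laden illustration rather than Spirit code: the names calc, eat and evaluate are hypothetical, digits are parsed unsigned (a leading sign is handled by the '-' and '+' branches of c), whitespace is not skipped, and values travel upstream only as return values, so it does not model one rule reading another rule's variable.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

struct calc
{
    std::string src;
    std::size_t pos = 0;

    explicit calc(std::string s) : src(std::move(s)) {}

    // Consume ch if it is the next character.
    bool eat(char ch)
    {
        if (pos < src.size() && src[pos] == ch) { ++pos; return true; }
        return false;
    }

    // a = b >> *(('+' >> b) | ('-' >> b));
    long a()
    {
        long value = b();               // a's own local value
        for (;;)
        {
            if (eat('+'))      value += b();
            else if (eat('-')) value -= b();
            else               return value;
        }
    }

    // b = c >> *(('*' >> c) | ('/' >> c));
    long b()
    {
        long value = c();               // b's own local value
        for (;;)
        {
            if (eat('*'))      value *= c();
            else if (eat('/'))
            {
                long const d = c();
                if (d == 0) throw std::runtime_error("division by zero");
                value /= d;
            }
            else               return value;
        }
    }

    // c = int_p | '(' >> a >> ')' | ('-' >> c) | ('+' >> c);
    long c()
    {
        long value = 0;                 // c's own local value
        if (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
        {
            while (pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos])))
                value = value * 10 + (src[pos++] - '0');
        }
        else if (eat('('))
        {
            value = a();                // c re-enters a: a new a frame with its own value
            if (!eat(')')) throw std::runtime_error("expected ')'");
        }
        else if (eat('-')) value = -c();
        else if (eat('+')) value = c();
        else throw std::runtime_error("expected a number");
        return value;
    }
};

// Parse the whole text as rule a and return its value.
long evaluate(std::string const& text)
{
    calc p(text);
    long const result = p.a();
    if (p.pos != p.src.size())
        throw std::runtime_error("unexpected trailing input");
    return result;
}
```

When c re-enters a for a parenthesized expression, the inner call to a gets its own value on the stack, and the outer a's value is untouched until the inner call returns.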

What are Closures?

Interestingly, nested functions demonstrate the flexibility of accessing a local variable existing in an outer scope. However, Spirit closures (inherited from Phoenix) are more powerful than the nested function example suggests.

A closure is an object that "closes" over the local variables of a function, making them visible and accessible outside the function. What is more interesting is that the closure actually packages a local context (a stack frame where some variables reside) and makes it available outside the scope in which it actually exists. The information is essentially "captured" by the closure, allowing it to be referred to anywhere and anytime, even prior to the actual creation of the variables.

The following diagram depicts the situation where a function A (or rule) exposes its closure and another function B references A's variables through its closure.

[Diagram: the closure as an object that "closes" over the local variables of a function, making them visible and accessible outside the function.]

Of course, function A should be active when A.x is referenced. This means that function B is reliant on function A (if B is a nested function of A, this will always be the case). The free-form nature of Spirit rules allows access to a closure variable anytime, anywhere. Accessing A.x is equivalent to referring to the topmost stack variable x of function A. If function A is not active when A.x is referenced, a runtime exception will be thrown.
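The "topmost stack variable" rule can be modelled with a small stack of frames. The sketch below uses only the standard library and is not Spirit's implementation: closure_model, frame, B and run_A are hypothetical names, the closure holds a single int x, and std::logic_error merely stands in for the runtime error mentioned above.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a closure with one variable, x.
class closure_model
{
public:
    // Access the x belonging to the most recent active invocation.
    int& x()
    {
        if (frames_.empty())
            throw std::logic_error("closure accessed while its function is inactive");
        return frames_.back();
    }

    bool active() const { return !frames_.empty(); }

    // RAII guard: pushes a frame on entry and pops it on exit.
    class frame
    {
    public:
        explicit frame(closure_model& c) : c_(c) { c_.frames_.push_back(0); }
        ~frame() { c_.frames_.pop_back(); }
        frame(frame const&) = delete;
        frame& operator=(frame const&) = delete;
    private:
        closure_model& c_;
    };

private:
    std::vector<int> frames_;   // one x per active invocation
};

closure_model A;                // "A.x" in the text corresponds to A.x() here

// Function B reaches A's variable only through the closure.
void B(int v) { A.x() = v; }

// Function A: opens a frame, lets B write into it, then returns the value.
int run_A(int v)
{
    closure_model::frame f(A);
    B(v);
    return A.x();
}
```

Because every entry pushes a new frame and every exit pops it, a recursive invocation of A sees a fresh x, and the previous x becomes visible again once the inner invocation returns. Calling B while no frame is open raises the error.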

free_closure

The free_closure class can wrap any parser. Doing so gives the parser its own closure. As an example, let us provide a parser with a closure that has two local variables: double x and int y.

First, we need to declare our closure:

    struct my_closure : free_closure<my_closure, double, int>
    {
        member1 x;
        member2 y;
    };

We derive a user-defined closure struct, my_closure, from free_closure. The template parameters to free_closure, from left to right, are:

1. The user defined my_closure.
2. The type of the first closure member: double.
3. The type of the second closure member: int.

member1 and member2 are indirect links to the actual closure variables. Their indirect types correspond to double and int, respectively.

Now that it's declared, we can instantiate and use it:

    rule<> a, b, c;
    my_closure clos;
    a = clos[b >> c];

clos provides a closure for the parser expression b >> c. The closure variables x and y are accessible as clos.x and clos.y:

    c = real_p[clos.x = arg1];

Using Phoenix, the result of real_p is assigned to clos's closure member x.
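The key point is that an expression such as clos.x = arg1 is built when the grammar is defined but only touches a variable later, when the action fires. The following standard-library sketch mimics that deferral with std::function. It is an assumption-based model, not Phoenix or Spirit code; lazy_closure, frame_t, assign_x and demo are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical model of one closure frame holding a double and an int.
struct frame_t
{
    double x = 0.0;
    int y = 0;
};

struct lazy_closure
{
    std::vector<frame_t> frames;    // one entry per active invocation

    frame_t& top()
    {
        if (frames.empty())
            throw std::logic_error("no active frame");
        return frames.back();
    }

    // "Indirect link" to x: the action is built now,
    // but the frame it writes to is looked up only when it runs.
    std::function<void(double)> assign_x()
    {
        return [this](double v) { top().x = v; };
    }
};

double demo()
{
    lazy_closure clos;
    auto action = clos.assign_x();      // built before any frame exists

    clos.frames.push_back(frame_t{});   // the enclosed parser is entered
    action(3.5);                        // the semantic action fires
    double const result = clos.top().x;
    clos.frames.pop_back();             // the enclosed parser exits

    return result;
}
```

Here demo() returns the value handed to the action, even though the action was created before the frame it wrote into.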

The first closure member, member1, is the closure's return value. This return value, like the one returned by real_p, for example, can be used to propagate data up the parser hierarchy or be passed to semantic actions.

closure

In most cases, we want our closures to provide local variables for our rules and grammars. To illustrate, consider a rule that parses even palindromes: a parser that can recognize textual mirror images such as "abcddcba". Without closures, this can be achieved by setting up a stack, pushing the first parsed character onto the stack, and then having the rule call itself recursively. Upon exit from the recursion, the pushed character is popped from the stack and used to recognize the next character to match. Tedious.

Using closures:

    struct my_closure : spirit::closure<my_closure, char>
    {
        member1 ch;
    };

    rule<scanner<>, my_closure::context_t> rev;
    rev = anychar_p[rev.ch = arg1] >> !rev >> f_ch_p(rev.ch);

As before, we declare our closure first. This time, we need only one closure member, a char. Take note that since we are providing a closure to a rule, we subclass my_closure from closure rather than from free_closure.

Remember rule and grammar contexts? This is the closure's means to hook into the grammar and rule. The closure has a special parser context that may be used to enable rule and grammar closures.

    rule<scanner<>, my_closure::context_t> rev;

Declares a rule rev with our closure.

    rev = anychar_p[rev.ch = arg1] >> !rev >> f_ch_p(rev.ch);

Defines our even palindrome parser. anychar_p parses any character and assigns it to rev's sole closure member, rev.ch. >> !rev optionally invokes rev recursively. f_ch_p(rev.ch) attempts to recognize rev.ch. f_ch_p is very similar to ch_p but accepts functors such as closure member accesses (see Parametric Parsers).
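For comparison, the same recognizer can be written by hand, with the local variable ch of each invocation playing the role of rev.ch. This is a sketch of the rule's structure using only the standard library. The function names are hypothetical, and the sketch commits to the first inner match that succeeds, so it is not a general palindrome test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hand-written counterpart of the rev rule above.
// Returns true and advances pos on a match; leaves pos untouched otherwise.
bool rev(std::string const& s, std::size_t& pos)
{
    std::size_t const start = pos;
    if (pos >= s.size())
        return false;               // no character left to consume

    char const ch = s[pos++];       // one ch per invocation, like rev.ch

    rev(s, pos);                    // optional recursion; a failed attempt restores pos

    if (pos < s.size() && s[pos] == ch)
    {
        ++pos;                      // the mirrored character matched
        return true;
    }

    pos = start;                    // the whole sequence failed: restore the position
    return false;
}

// True if the whole string is consumed by a single rev match.
bool is_even_palindrome(std::string const& s)
{
    std::size_t pos = 0;
    return rev(s, pos) && pos == s.size();
}
```

Each recursive call keeps its own ch on the call stack, which is what the closure member provides for the rule: no explicit push and pop is needed.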

The same can be applied to grammars. Example:

    struct my_grammar : public grammar<my_grammar, my_closure::context_t>
    {
        template <typename ScannerT>
        struct definition
        {
            definition(my_grammar const& self)
            {
                /*...*/
            }
        };

        /*...*/
    };