Mar ’10 11

I have a mixed relationship with variant…

I just wrote a parser for S-expressions (that will be the basis of ASTs and intermediate types in my planned “write-a-compiler” article series). The parser itself is easy, but as always, I spent more time on the underlying data structures.

What are S-expressions? S-expressions, also called sexps, are recursive, list based, data structures. Being recursive, they can represent hierarchical information. S-expressions are parenthesized prefix expressions, known for their use in LISP (and its sibling Scheme). Here’s a simple sexp:

(* 2 (+ 3 4))

The sexp above corresponds to this infix expression:

(2 * (3 + 4))

S-expressions are simple and infinitely powerful beasts as evident in applications that use LISP as their scripting language. They can represent code and data. Some people even use S-expressions as a suitable (and terser!) replacement for XML. The in-memory data structures are very easy to use, transform and manipulate, traverse and compile or accumulate results from.

The plan is to use S-expressions as our AST representation and embed a minimal LISP/Scheme interpreter IN the compiler. This implies that along the way, we’ll be building an S-expression parser and a LISP/Scheme interpreter. How cool is that? … We’re talking about scripting the compiler with an interpreter! :-)

I needed a dynamic data type that can represent the S-expressions. I called it utree, short for universal-tree. I want it to be as simple as it can be and fast and tight in memory footprint. Boost variant was simply out of the question (I used it in one early prototypes). For one, it failed a basic requirement (tight memory footprint). The padding and the way it aligns the “what-type” integer member is quite wasteful. It uses a conservative alignment using the worst alignment of the types in the union. Thus if you have a type in there that aligns to 8 bytes, variant requires another 8 bytes just for the type discriminator! Try it out:

struct x { void* a; void* b; void* c; };
/***/
std::cout << sizeof(x) << std::endl;
std::cout << sizeof(boost::variant<x, int, double>) << std::endl;

I get: 12 and 24 respectively (32 bit system).

I ended up with 40 bytes in my initial prototype (using STL containers and variant) and later squeezed that to 24 (minimum). I did away with variant in my latest version and got 16 bytes. In this case, I “stole” unused padding bits from the data to store the discriminator. ¬†¬†With this 16 bytes, I have nil, bool, int, double, string and (double linked) list. The string itself steals memory when it can (i.e. it stores the string in the union when it can and only uses the heap when needed). The string steals as much as it can. So, on 32 bit systems, it can store in-situ as much as 14 bytes. That’s a lot for storing simple strings like symbols and identifiers. On 64 bit systems, you can store a lot more in-situ and minimize heap usage more.

At this point, I feel like writing my own variant type that can do such things (intrusive variant?). Barring the use of Boost.Variant, I needed to write my own data structures (double linked list). I really wanted to use Boost.Intrusive which is quite efficient, but because I had to squeeze my own variant in there, I had to make use of unions which require PODs!

Here’s the work in progress:
http://boost-spirit.com/dl_more/scheme/scheme_v0.2/

Here’s the utree API:


    ///////////////////////////////////////////////////////////////////////////
    // The main utree (Universal Tree) class
    // The utree is a hierarchical, dynamic type that can store:
    //  - a nil
    //  - a bool
    //  - an integer
    //  - a double
    //  - a string (textual or binary)
    //  - a (doubly linked) list of utree
    //  - a reference to a utree
    //
    // The utree has minimal memory footprint. The data structure size is
    // 16 bytes on a 32-bit platform. Being a container of itself, it can
    // represent tree structures.
    ///////////////////////////////////////////////////////////////////////////
    class utree
    {
    public:

        typedef utree value_type;
        typedef detail::list::node_iterator<utree> iterator;
        typedef detail::list::node_iterator<utree const> const_iterator;
        typedef utree& reference;
        typedef utree const& const_reference;
        typedef std::ptrdiff_t difference_type;
        typedef std::size_t size_type;

        struct nil {};

        utree();
        explicit utree(bool b);
        explicit utree(unsigned int i);
        explicit utree(int i);
        explicit utree(double d);
        explicit utree(char const* str);
        explicit utree(char const* str, std::size_t len);
        explicit utree(std::string const& str);
        explicit utree(boost::reference_wrapper<utree> ref);

        utree(utree const& other);
        ~utree();

        utree& operator=(utree const& other);
        utree& operator=(bool b);
        utree& operator=(unsigned int i);
        utree& operator=(int i);
        utree& operator=(double d);
        utree& operator=(char const* s);
        utree& operator=(std::string const& s);
        utree& operator=(boost::reference_wrapper<utree> ref);

        template <typename F>
        typename F::result_type
        static visit(utree const& x, F f);

        template <typename F>
        typename F::result_type
        static visit(utree& x, F f);

        template <typename F>
        typename F::result_type
        static visit(utree const& x, utree const& y, F f);

        template <typename F>
        typename F::result_type
        static visit(utree& x, utree const& y, F f);

        template <typename F>
        typename F::result_type
        static visit(utree const& x, utree& y, F f);

        template <typename F>
        typename F::result_type
        static visit(utree& x, utree& y, F f);

        template <typename T>
        void push_back(T const& val);

        template <typename T>
        void push_front(T const& val);

        template <typename T>
        iterator insert(iterator pos, T const& x);

        template <typename T>
        void insert(iterator pos, std::size_t, T const& x);

        template <typename Iter>
        void insert(iterator pos, Iter first, Iter last);

        template <typename Iter>
        void assign(Iter first, Iter last);

        void clear();
        void pop_front();
        void pop_back();
        iterator erase(iterator pos);
        iterator erase(iterator first, iterator last);

        utree& front();
        utree& back();
        utree const& front() const;
        utree const& back() const;

        utree& operator[](std::size_t i);
        utree const& operator[](std::size_t i) const;

        void swap(utree& other);

        iterator begin();
        iterator end();
        const_iterator begin() const;
        const_iterator end() const;

        bool empty() const;
        std::size_t size() const;
    };

    bool operator==(utree const& a, utree const& b);
    bool operator<(utree const& a, utree const& b);
    bool operator!=(utree const& a, utree const& b);
    bool operator>(utree const& a, utree const& b);
    bool operator<=(utree const& a, utree const& b);
    bool operator>=(utree const& a, utree const& b);

35 Responses to “S-expressions and variant”

  1. Darid says:

    What is an S-expression?
    would you please be kind enough to explain it to a layman like myself.

  2. Larry Evans says:

    You’re storing the discriminator in the padding. Couldn’t the same be
    done by changing boost/variant/variant.hpp:215-216 from:

    which_t which_;
    storage_t storage_;

    to:

    storage_t storage_;
    which_t which_;

    ? IOW, the compiler should be able to figure out how to best fit
    which_ after storage_.

    • Joel de Guzman says:

      Nope. I don’t think so. The padding will be the same either way. Try it out.

      • Larry Evans says:

        OK, but now I can’t figure how you store the discriminant inside the
        padding at the end of your union of POD’s without somehow
        misaligning the discriminant. I guess just saying I wonder why
        the compiler writers couldn’t figure how to do this most efficiently.
        Do you know of some language rules which require this wasted
        space?

        • Joel de Guzman says:

          I did nothing special. I just stole one byte from the fast_string (the string that places its data in-situ) for the discriminator. Check out the implementation of fast_string:

          http://boost-spirit.com/dl_more/scheme/scheme_v0.1/detail/utree_detail1.hpp

          • Would you mind explaining why you need the lbuff field in the fast_string union? I see that it has the size buff_size / (sizeof(long) / sizeof(char)); why that size specifically?

            You seem to use during utree initialization before assigning a value. My first thought was that its purpose is to zero out the memory, but then why would you need an extra union field?

          • Joel de Guzman says:

            Mathias, its purpose is to make sure that 1) we initialize the string to zero 2) that we have a size that’s divisible by sizeof(long). The latter has the advantage of a) optimizing the buffer usage since machine architectures tend to align the structs/unions anyway, and b) having to zero the string in size/sizeof(long) operations instead of size operations.

          • Thanks, I understand (1) and now (2a). For (2b), why didn’t you use memset? My understanding is that is the most efficient way to initialize memory with a constant value.

          • Joel de Guzman says:

            That is the common perception alright, but i’m not so sure about that for this case. memset has to deal with off sizes. Anyway, if I really wanted blazing speed, I’d unroll the loop (since the size is static anyway), but that is not the main intent.

        • Larry Evans says:

          I tried using a union with a discriminant inside a struct. I get the same
          size of for composite_tagged_seq:

          IOW, the following code:

          #include 
          #include 
          #include 
          
          #include 
          
          struct x { void* a; void* b; void* c; };
          
          struct x_union
          {
                union
                { x u_x
                ; int u_int
                ; double u_dbl
                ;}
              storage
              ;
                char
              tag
              ;
          };
          struct x_oneof
          : boost::composite_tagged_seq
            < boost::composite_tags::one_of_maybe
            , boost::mpl::integral_c
            , boost::mpl::list
              
            >
          {
          };
          
          int main(void)
          {
              std::cout<<"sizeof(x)="<<sizeof(x)<<"\n";
              std::cout<<"sizeof(x_union)="<<sizeof(x_union)<<"\n";
              std::cout<<"sizeof(x_oneof)="<<sizeof(x_oneof)<<"\n";
              return 0;
          }    
          

          produces:

          sizeof(x)=24
          sizeof(x_union)=32
          sizeof(x_oneof)=32

          • Joel de Guzman says:

            Yep. You can’t get away with the extra padding. In my case, I just store the discriminant (intrusively) in the largest union type in a place where it can not be overwritten by other inhabitants of the union. That is a caveat. I place this at the right/bottom-most of the largest struct but there can be some obscure OS where some weird alignment can place data such that this is overwritten. This can be tested at compile time though. For a generic variant, one can perhaps use the same trick by having an array of chars with 1+ the largest struct and place the discriminator there at the extra byte.

  3. Dainis says:

    Sometimes for tree is needed some function on its all elements .
    For u-tree its possible to doo for_each?

    • Joel de Guzman says:

      Definitely. It’s an STL container. You can also do unary and binary visitation like you do with variant, allowing you to traverse the tree any way you want.

  4. Larry Evans says:

    The reason for this:

    Thus if you have a type in there that aligns to 12 bytes, variant
    requires another 12 bytes just for the type discriminator!

    is to preserve the 12 byte alignment for the oddly aligned type when the
    variant is put into an array. By storing the type discriminant in the
    1st 12 bytes of the containing structure, then when the containing structure
    is put into a 2 element array, the 2nd array element will have the properly
    aligned oddly aligned type. Now I suppose you could avoid this restriction
    if you’re sure never to put utree into an array.

    • Joel de Guzman says:

      Yes, Larry, that is correct. 12 bytes alignment is quite odd though. 64 bit machines I am aware of typically aligns to 8 bytes (64 bits). Correct me if I’m wrong. Anyway, the point really is to make use of the padding bits and optimize the packing of the data.

  5. I like the idea of a more efficient variant. Perhaps I am misunderstanding a fundamental goal of the utree, but how would it be possible to store an arbitrary type T in it, which seems to be a requirement for creating arbitrary ASTs.

    • Joel de Guzman says:

      You don’t. Like in scheme, you only have one general data structure that can represent any data structure. In scheme S-expressions *can* represent any type of hierarchical data. You store arbitrary heterogeneous data there in an intermediate form that you later convert to your final form by walking the tree and building your actual data structures.

  6. Dainis says:

    In lisp by a sample syntax aproximately is (operation operand1 operand2 operand3 ), then how i can represent in utree operation and list of operands
    in same time when utree has only one variant value ?

    • Joel de Guzman says:

      A UTree element can also be a list.

      • Dainis says:

        Thanks for replay, I’ll try to make my tree (actually i mean operands as rule references, and operation as one of spirit rule operations )

      • Dainis says:

        Sorry for maybe dumb question, is possible to construct utree as list from
        std::list or std::vector or maybe scheme::detail::list on one line say
        scheme::utree my_utree( std::vector(21) );
        I spend 3 hours to quess right syntax, but no results…,
        only errors like …
        C:\csource\utree\test.cpp:117: error: no matching function for call to ‘scheme::utree::utree(scheme::detail::list&)’
        C:\csource\utree\detail/utree_detail2.hpp:979: note: candidates are: scheme::utree::utree(scheme::utree::construct_list)

        • Dainis says:

          Sorry on more time I have lost < > ,
          std::list<int > or std::vector<int > or maybe scheme::detail::list on one line say
          scheme::utree my_utree( std::vector<int >(21) );
          I spend 3 hours to quess right syntax, but no results…,
          only errors like …
          C:\csource\utree\test.cpp:117: error: no matching function for call to ‘scheme::utree::utree(scheme::detail::list&)’
          C:\csource\utree\detail/utree_detail2.hpp:979: note: candidates are: scheme::utree::utree(scheme::utree::construct_list)

        • Hartmut Kaiser says:

          utree has an additional constructor (not shown in the listing above) taking a boost::iterator_range:

                  template <typename Iter>
                  utree(boost::iterator_range<Iter> r);
          

          which should do what you want.

          Regrads Hartmut

          • Dainis says:

            Thanks Hartmut, that works, just now I understand that if
            utree is constructed as list , only makes a sense to construct it as list of
            of utrees ;-)

  7. Xavi Gratal says:

    That looks good. I will certainly have a use for this.
    But I have a concern… isn’t storing members of a union as one type and then reading them as another forbidden by the standard. IIRC it breaks the strict-aliasing rule, and it can cause problems with certain compilers that do optimization taking advantage of the aliasing rules.
    It would probably be better to store it as an unsigned char array and then cast as needed (casting to and from char pointers is allowed by the standard).

  8. Olaf says:

    How can I handle optional, even rekursive??

  9. Olaf says:

    Only for interest: Why do you wrote your own list implementation and don’t use std::list? Why is there no allocator interface like the std::list has which would allow to use pools, e.g. boost’s fast_pool_allocator? Would be any benefit from them?

    • Joel de Guzman says:

      1) size constraints. std::list is 16 bytes last I checked (32 bit compiler) 2) I needed to keep the elements in the variant PODs to simplify the implementation. It’s a lot easier this way than having to rewrite variant for our very tuned and specific application.

Leave a Reply

To use RetinaPost you must register at http://www.RetinaPost.com/register