|  | Home | Libraries | People | FAQ | More | 
As a prerequisite in understanding this tutorial, please review the previous employee example. This example builds on top of that example.
Stop and think about it... We're actually generating ASTs (abstract syntax trees) in our previoius examples. We parsed a single structure and generated an in-memory representation of it in the form of a struct: the struct employee. If we changed the implementation to parse one or more employees, the result would be a std::vector<employee>. We can go on and add more hierarchy: teams, departments, corporations, etc. We can have an AST representation of it all.
        This example shows how to annotate the AST with the iterator positions for
        access to the source code when post processing using a client supplied on_success handler. The example will show
        how to get the position in input source stream that corresponds to a given
        element in the AST.
      
        In addition, This example also shows how to "inject" client data,
        using the "with" directive, that the on_success
        handler can access as it is called within the parse traversal through the
        parser's context.
      
The full cpp file for this example can be found here: annotation.cpp
        First, we'll update our previous employee struct, this time separating the
        person into its own struct. So now, we have two structs, the person and the employee.
        Take note too that we now inherit person
        and employee from x3::position_tagged
        which provides positional information that we can use to tell the AST's position
        in the input stream anytime.
      
namespace client { namespace ast { struct person : x3::position_tagged { person( std::string const& first_name = "" , std::string const& last_name = "" ) : first_name(first_name) , last_name(last_name) {} std::string first_name, last_name; }; struct employee : x3::position_tagged { int age; person who; double salary; }; }}
Like before, we need to tell Boost.Fusion about our structs to make them first-class fusion citizens that the grammar can utilize:
BOOST_FUSION_ADAPT_STRUCT(client::ast::person, first_name, last_name ) BOOST_FUSION_ADAPT_STRUCT(client::ast::employee, age, who, salary )
        Before we proceed, let me introduce a helper class called the position_cache. It is a simple class that
        collects iterator ranges that point to where each element in the AST are
        located in the input stream. Given an AST, you can query the position_cache
        about AST's position. For example:
      
auto pos = positions.position_of(my_ast);
        Where my_ast is the AST,
        positions and is the position_cache, position_of
        returns an iterator range that points to the start and end (pos.begin() and pos.end())
        positions where the AST was parsed from. positions.begin() and positions.end()
        points to the start and end of the entire input stream.
      
        The on_success gives you
        everything you want from semantic actions without the visual clutter. Declarative
        code can and should be free from imperative code. on_success
        as a concept and mechanism is an important departure from how things are
        done in Spirit's previous version: Qi.
      
As demonstrated in the previous employee example, the preferred way to extract data from an input source is by having the parser collect the data for us into C++ structs as it traverses the input stream. Ideally, Spirit X3 grammars are fully attributed and declared in such a way that you do not have to add any imperative code and there should be no need for semantic actions at all. The parser simply works as declared and you get your data back as a result.
        However, there are certain cases where there's no way to avoid introducing
        imperative code. But semantic actions messes up our clean declarative grammars.
        If we care to keep our code clean, on_success
        handlers are alternative callback hooks to client code that are executed
        by the parser after a successful parse without polluting the grammar. Like
        semantic actions, on_success
        handlers have access to the AST, the iterators, and context. But, unlike
        semantic actions, on_success
        handlers are cleanly separated from the actual grammar.
      
        As discussed, we annotate the AST with its position in the input stream with
        our on_success handler:
      
// tag used to get the position cache from the context struct position_cache_tag; struct annotate_position { template <typename T, typename Iterator, typename Context> inline void on_success(Iterator const& first, Iterator const& last , T& ast, Context const& context) { auto& position_cache = x3::get<position_cache_tag>(context).get(); position_cache.annotate(ast, first, last); } };
        position_cache_tag is a special
        tag we will use to get a reference to the actual position_cache,
        client data that we will inject at very start, when we call parse. More on
        that later.
      
        Our on_success handler gets
        a reference to the actual position_cache
        and calls its annotate member
        function, passing in the AST and the iterators. position_cache.annotate(ast,
        first,
        last)
        annotates the AST with information required by x3::position_tagged.
      
Now we'll write a parser for our employee. To simplify, inputs will be of the form:
{ age, "forename", "surname", salary }
namespace parser { using x3::int_; using x3::double_; using x3::lexeme; using ascii::char_; struct quoted_string_class; struct person_class; struct employee_class; x3::rule<quoted_string_class, std::string> const quoted_string = "quoted_string"; x3::rule<person_class, ast::person> const person = "person"; x3::rule<employee_class, ast::employee> const employee = "employee"; auto const quoted_string_def = lexeme['"' >> +(char_ - '"') >> '"']; auto const person_def = quoted_string >> ',' >> quoted_string; auto const employee_def = '{' >> int_ >> ',' >> person >> ',' >> double_ >> '}' ; auto const employees = employee >> *(',' >> employee); BOOST_SPIRIT_DEFINE(quoted_string, person, employee); }
struct quoted_string_class; struct person_class; struct employee_class; x3::rule<quoted_string_class, std::string> const quoted_string = "quoted_string"; x3::rule<person_class, ast::person> const person = "person"; x3::rule<employee_class, ast::employee> const employee = "employee";
Go back and review the original employee parser. What has changed?
quoted_string, person
            and employee.
          quoted_string_class,
            person_class, and employee_class.
          
        Like before, in this example, the rule classes, quoted_string_class,
        person_class, and employee_class provide statically known
        IDs for the rules required by X3 to perform its tasks. In addition to that,
        the rule class can also be extended to have some user-defined customization
        hooks that are called:
      
        By subclassing the rule class from a client supplied handler such as our
        annotate_position handler
        above:
      
struct person_class : annotate_position {}; struct employee_class : annotate_position {};
        The code above tells X3 to check the rule class if it has an on_success or on_error
        member functions and appropriately calls them on such events.
      
        For any parser p, one can
        inject supplementary data that semantic actions and handlers can access later
        on when they are called. The general syntax is:
      
with<tag>(data)[p]
        For our particular example, we use to inject the position_cache
        into the parse for our annotate_position
        on_success handler to have access to:
      
auto const parser = // we pass our position_cache to the parser so we can access // it later in our on_sucess handlers with<position_cache_tag>(std::ref(positions)) [ employees ];
        Typically this is done just before calling x3::parse
        or x3::phrase_parse. with
        is a very lightwight operation. It is possible to inject as much data as
        you want, even multiple with
        directives:
      
with<tag1>(data1) [ with<tag2>(data2)[p] ]
        Multiple with directives
        can (perhaps not obviously) be injected from outside the called function.
        Here's an outline:
      
template <typename Parser> void bar(Parser const& p) { // Inject data2 auto const parser = with<tag2>(data2)[p]; x3::parse(first, last, parser); } void foo() { // Inject data1 auto const parser = with<tag1>(data1)[my_parser]; bar(p); }
Now we have the complete parse mechanism with support for annotations:
using iterator_type = std::string::const_iterator; using position_cache = boost::spirit::x3::position_cache<std::vector<iterator_type>>; std::vector<client::ast::employee> parse(std::string const& input, position_cache& positions) { using boost::spirit::x3::ascii::space; std::vector<client::ast::employee> ast; iterator_type iter = input.begin(); iterator_type const end = input.end(); using boost::spirit::x3::with; // Our parser using client::parser::employees; using client::parser::position_cache_tag; auto const parser = // we pass our position_cache to the parser so we can access // it later in our on_sucess handlers with<position_cache_tag>(std::ref(positions)) [ employees ]; bool r = phrase_parse(iter, end, parser, space, ast); // ... Some error checking here return ast; }
Let's walk through the code.
        First, we have some typedefs for 1) The iterator type we are using for the
        parser, iterator_type and
        2) For the position_cache
        type. The latter is a template that accepts the type of container it will
        hold. In this case, a std::vector<iterator_type>.
      
        The main parse function accepts an input, a std::string and a reference to
        a position_cache, and retuns an AST: std::vector<client::ast::employee>.
      
Inside the parse function, we first create an AST where parsed data will be stored:
std::vector<client::ast::employee> ast;
        Then finally, we create a parser, injecting a reference to the position_cache, and call phrase_parse:
      
using client::parser::employees; using client::parser::position_cache_tag; auto const parser = // we pass our position_cache to the parser so we can access // it later in our on_sucess handlers with<position_cache_tag>(std::ref(positions)) [ employees ]; bool r = phrase_parse(iter, end, parser, space, ast);
        On successful parse, the AST, ast,
        will contain the actual parsed data.
      
Now that we have our main parse function, let's have an example sourcefile to parse and show how we can obtain the position of an AST element, returned after a successful parse.
Given this input:
std::string input = R"( { 23, "Amanda", "Stefanski", 1000.99 }, { 35, "Angie", "Chilcote", 2000.99 }, { 43, "Dannie", "Dillinger", 3000.99 }, { 22, "Dorene", "Dole", 2500.99 }, { 38, "Rossana", "Rafferty", 5000.99 } )";
        We call our parse function after instantiating a position_cache
        object that will hold the source stream positions:
      
position_cache positions{input.begin(), input.end()}; auto ast = parse(input, positions);
        We now have an AST, ast,
        that contains the parsed results. Let us get the source positions of the
        2nd employee:
      
auto pos = positions.position_of(ast[1]); // zero based of course!
        pos is an iterator range
        that contians iterators to the start and end of ast[1]
        in the input stream.
      
If you read the previous Program Structure tutorial where we separated various logical modules of the parser into separate cpp and header files, and you are wondering how to provide the context configuration information (see Config Section), we need to supplement the context like this:
using phrase_context_type = x3::phrase_parse_context<x3::ascii::space_type>::type; typedef x3::context< error_handler_tag , std::reference_wrapper<position_cache> const , phrase_context_type> context_type;