Skip to content

Parsing Version 4

Skipping version 3 entirely (let’s just not go there, shall we?), I have now a version 4 on the web. It currently does arithmetic without parentheses.

But what it does do is take the arithmetic expression(s) and parse them into trees, bottom-up with the computed values in the nodes and operations above. The graphics are not pretty just yet, but it works and it is completely in the browser–no server-side shenanigans. This is all thanks to jsxgraph; they just saved me months of work, work which, unlike parser writing, I had no desire to do.

Below the graph is a bunch of strange looking code–this is text that can be fed into graphviz for a different graph view.

The really nice thing about this version is that I define the grammar and behavior write in the index.html file. I have achieved separation of the general parser stuff from the behavior of the language I want to deal with.

This is the definition of arithmetic:

    var simpleParSet = {symbolDefinitions:{
                "+": {types:{infix:20}},
                "-": {types:{infix:20}},
                "*": {types:{infix:30}},
                "/": {types:{infix:30}},
                "^": {types:{infix:40}},
                "\u00AC": {types:{symbol:0}}
                }
    };

    var par = prattCrockford.makeTree(simpleParSet);

And to make a tree, the call is

tree = new par({str:"3+7*5^2"});

And that’s it. Tree defined. It has no behavior.

To add behavior, I have a seeder method that can install behavior into the nodes. This can be done globally across all parsers, across all trees for a language, or for a single tree. With prototype chaining this is easy; I call seeder at the level I want even though it is defined at the bottom of the chain. Awesome.

Once seeder is done, you can then call walker to walk the tree or shooter to walk the parse order. And that’s it.

Here is an example of the use of the seeder function to seed all of this language’s trees:

    var makeCompute = function (fun) {
        return function () {
            this.computedValue = fun (this.nodes.left.compute(), this.nodes.right.compute());
            return this.computedValue;
        }
    };
    par.prototype.seeder({
        "(number)" : function(sym) {
             sym.prototype.compute = function () {
                  this.computedValue = this.value;
                 return this.value;};
        },
        "+" : function (sym) {
                sym.prototype.compute = makeCompute(function (l,r) {return l+r;});
        },
        "-" : function (sym) {
                sym.prototype.compute = makeCompute(function (l,r) {return l-r;});
        },
        "*" : function (sym) {
                sym.prototype.compute = makeCompute(function (l,r) {return l*r;});
        },
        "/" : function (sym) {
                sym.prototype.compute = makeCompute(function (l,r) {return l/r;});
        },
        "^" : function (sym) {
                sym.prototype.compute = makeCompute(function (l,r) {return Math.pow(l,r);});
        },

        "\u00AC" : function(sym) {
                sym.prototype.compute = $.noop;
        },
        "(end)" : function(sym) {
             sym.prototype.compute = $.noop;
        }
        },
        function (sym) {
            sym.prototype.compute = function () {this.computedValue = NaN; return this.computedValue;}
        }
    );

Notice how I created a helper function makeCompute that essentially reduces boilerplate code and allows one to specify just the differing behavior. All sorts of tricks like this can be used to make this more efficient. I also have a couple of $.noop’s for symbols that should not be processed (end of lines, for example). I will probably make a common convention to deal with them so that that boilerplate can be removed. I may remove them from the trunk of the tree altogether and only leave them in the source token stream.

Anyway, now that we have the seeder function, now what?

        tree = new par({str:"3+4\n7*5^2+1});
        $.log(tree.walker("compute"));

The return value will be an array of the computed values of the different arithmetic expressions in the input: [7, 176]

And that is how to write a parser. I can continue to add behaviors of all kind, using the behavior data I already generated, and even remove behaviors later if I want or overwrite. I have complete flexibility and power on how to make my trees behave.

Now I need to add parsing behaviors.

Post a Comment

Your email is never published nor shared.