Standards: Yacc, JavaCC and Racc

To compare the three parser generators, here I'll show an easy sample written in them.

Target Syntax

The following codes are accepted.

"1+2"
"23423423432 + 923401"
"23432 + 2"

The compiler will calculate the single addition and shows the value.

The following codes aren't accepted.

"1+2+3"
"1-2"
"1+"

JavaCC

The following code is from Standard Compiler by Minero Aoki.

The filename is A.jj.

PARSER_BEGIN(A)
import java.io.*;

class A {
  static public void main(String[] args) {
    for (String arg : args) {
      try {
        System.out.println(evaluate(arg));
      }
      catch (ParseException ex) {
        System.err.println(ex.getMessage());
      }
    }
  }

  static public long evaluate(String src) throws ParseException {
    Reader reader = new StringReader(src);
    return new A(reader).expr();
  }
}
PARSER_END(A)

SKIP: { <[" ", "\t", "\r", "\n"]> }

TOKEN: {
  <INTEGER: (["0"-"9"])+>
}

long expr():
{
  Token x, y;
}
{
  x=<INTEGER> "+" y=<INTEGER> <EOF>
    {
      return Long.parseLong(x.image) + Long.parseLong(y.image);
    }
}

To run this, type the following commends

$ javacc A.jj
$ javac A.java
$ java A '1 +  3'
4

This build process produces the following files automatically.

A.class
A.java
AConstants.class
AConstants.java
ATokenManager.class
ATokenManager.java
ParseException.class
ParseException.java
SimpleCharStream.class
SimpleCharStream.java
Token.class
Token.java
TokenMgrError.class
TokenMgrError.java

Yacc and Lex

a.y

%token NUMBER

%%

expr : NUMBER '+' NUMBER { printf("%d", $1 + $3); }

%%
#include <stdio.h>
#include "lex.yy.c"

a.l

%{
#include "a.tab.h"
%}
%%
[0-9]+    {sscanf(yytext,"%d",&yylval); return(NUMBER);}
[ \r\n\t]   ;
.         return(yytext[0]);
%%
#ifndef yywrap
yywrap() { return 1; }
#endif

run

$ bison -d a.y && lex a.l
$ gcc a.tab.c -ly -ll
$ ./a.out

files automatically generated

a.out
a.tab.c
a.tab.h
lex.yy.c

Racc

I referred this blog entry http://d.hatena.ne.jp/morphine57/20090824/1251129740

a.racc

class A
  token NUM
rule
   expr : NUM '+' NUM { result = val[0] + val[2] }
end
---- header
require 'strscan'
---- inner
  def parse(str)
    @tokens = []
    s = StringScanner.new(str)
    until s.eos?
      case
      when s.scan(/[0-9]+/)
        @tokens << [:NUM, s[0].to_i]
      when s.skip(/[ \t\r\n]/)
      else
        @tokens << [s.getch, nil]
      end
    end
    do_parse()
  end

  def next_token
    @tokens.shift
  end

And runs

$ racc a.racc
$ ruby192 -r ./a.tab.rb -e 'p A.new.parse "1+ 2"'
3

Files automatically generated

a.tab.rb

Statistics

The lines of code which I have to write by myself:

JavaCC: 39
Yacc: 9 + 11
Racc: 26

The lines of code which are automatically generated:

JavaCC: 1446
Yacc: 3280
Racc: 117 (assuming the gem library racc is already installed)

3 comments:

morphine57December 20, 2009 at 5:28 PM
For java, it's too producing files.i think so.
Tom CopelandDecember 20, 2009 at 10:42 PM
The only thing I would suggest is that if you don't mind a static parser, you can eliminate the "options" part of the grammar completely. That only saves 3 lines, though...
ujmDecember 21, 2009 at 3:05 AM
Thanks!

(I updated this blog entry now to eliminate the "options" and added the lists of automagically generated by yacc and racc)

Standards

Sunday, December 20, 2009

Yacc, JavaCC and Racc

Target Syntax

JavaCC

Yacc and Lex

Racc

Statistics

3 comments:

About Me

Blog Archive

Followers