To compare the three parser generators, here I'll show an easy sample written in them.
Target Syntax
The following codes are accepted.
- "
1+2
" - "
23423423432 + 923401
" - "
23432 + 2
"
The compiler will calculate the single addition and shows the value.
The following codes aren't accepted.
- "
1+2+3
" - "
1-2
" - "
1+
"
JavaCC
The following code is from Standard Compiler by Minero Aoki.
The filename is A.jj
.
PARSER_BEGIN(A)
import java.io.*;
class A {
static public void main(String[] args) {
for (String arg : args) {
try {
System.out.println(evaluate(arg));
}
catch (ParseException ex) {
System.err.println(ex.getMessage());
}
}
}
static public long evaluate(String src) throws ParseException {
Reader reader = new StringReader(src);
return new A(reader).expr();
}
}
PARSER_END(A)
SKIP: { <[" ", "\t", "\r", "\n"]> }
TOKEN: {
<INTEGER: (["0"-"9"])+>
}
long expr():
{
Token x, y;
}
{
x=<INTEGER> "+" y=<INTEGER> <EOF>
{
return Long.parseLong(x.image) + Long.parseLong(y.image);
}
}
To run this, type the following commends
$ javacc A.jj
$ javac A.java
$ java A '1 + 3'
4
This build process produces the following files automatically.
- A.class
- A.java
- AConstants.class
- AConstants.java
- ATokenManager.class
- ATokenManager.java
- ParseException.class
- ParseException.java
- SimpleCharStream.class
- SimpleCharStream.java
- Token.class
- Token.java
- TokenMgrError.class
- TokenMgrError.java
Yacc and Lex
a.y
%token NUMBER
%%
expr : NUMBER '+' NUMBER { printf("%d", $1 + $3); }
%%
#include <stdio.h>
#include "lex.yy.c"
a.l
%{
#include "a.tab.h"
%}
%%
[0-9]+ {sscanf(yytext,"%d",&yylval); return(NUMBER);}
[ \r\n\t] ;
. return(yytext[0]);
%%
#ifndef yywrap
yywrap() { return 1; }
#endif
run
$ bison -d a.y && lex a.l
$ gcc a.tab.c -ly -ll
$ ./a.out
files automatically generated
- a.out
- a.tab.c
- a.tab.h
- lex.yy.c
Racc
I referred this blog entry http://d.hatena.ne.jp/morphine57/20090824/1251129740
a.racc
class A
token NUM
rule
expr : NUM '+' NUM { result = val[0] + val[2] }
end
---- header
require 'strscan'
---- inner
def parse(str)
@tokens = []
s = StringScanner.new(str)
until s.eos?
case
when s.scan(/[0-9]+/)
@tokens << [:NUM, s[0].to_i]
when s.skip(/[ \t\r\n]/)
else
@tokens << [s.getch, nil]
end
end
do_parse()
end
def next_token
@tokens.shift
end
And runs
$ racc a.racc
$ ruby192 -r ./a.tab.rb -e 'p A.new.parse "1+ 2"'
3
Files automatically generated
- a.tab.rb
Statistics
The lines of code which I have to write by myself:
- JavaCC: 39
- Yacc: 9 + 11
- Racc: 26
The lines of code which are automatically generated:
- JavaCC: 1446
- Yacc: 3280
- Racc: 117 (assuming the gem library
racc
is already installed)
For java, it's too producing files.i think so.
ReplyDeleteThe only thing I would suggest is that if you don't mind a static parser, you can eliminate the "options" part of the grammar completely. That only saves 3 lines, though...
ReplyDeleteThanks!
ReplyDelete(I updated this blog entry now to eliminate the "options" and added the lists of automagically generated by yacc and racc)