Blogged by Ujihisa. Standard methods of programming and thoughts including Clojure, Vim, LLVM, Haskell, Ruby and Mathematics written by a Japanese programmer. github/ujihisa

Sunday, July 26, 2009

Parsers around Ruby

I'll introduce some parsers for Ruby, or written in Ruby: parse_tree, ruby_parser, treetop and ripper.

Only two definitions of Ruby's syntax appear in this entry: parse.y, which is part of MRI, and ruby_parser.y, which exists only in the ruby_parser library.

parse_tree

A parser which converts Ruby code into S-expressions. It is written in Ruby with a C extension.

parse_tree uses racc from the Ruby standard library. ([Added on July 28]: This was wrong: parse_tree doesn't use racc.)

require 'rubygems'
require 'parse_tree'

a = ParseTree.translate '1+1'
p a.class #=> Array
p a #=> [:call, [:lit, 1], :+, [:array, [:lit, 1]]]

or simply

$ echo "1+1" | parse_tree_show -f
s(:call, s(:lit, 1), :+, s(:arglist, s(:lit, 1)))
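To make the shape of these S-expressions concrete, here is a minimal sketch that walks the nested array and evaluates it. The helper eval_sexp is hypothetical and not part of parse_tree; it assumes only the :lit, :call and :array/:arglist nodes that appear in the output above, and it needs no gem.

```ruby
# Hypothetical helper, not part of parse_tree: evaluates a tiny subset
# of the nested-array S-expressions shown above.
def eval_sexp(node)
  type, *rest = node
  case type
  when :lit
    rest.first                              # the literal value itself
  when :array, :arglist
    rest.map { |child| eval_sexp(child) }   # evaluate each argument
  when :call
    receiver, method_name, args = rest
    arg_values = args ? eval_sexp(args) : []
    eval_sexp(receiver).public_send(method_name, *arg_values)
  else
    raise ArgumentError, "unsupported node: #{type.inspect}"
  end
end

tree = [:call, [:lit, 1], :+, [:array, [:lit, 1]]]
puts eval_sexp(tree) #=> 2
```

Each node is an array whose first element names the node type, so a case on that first element is enough to dispatch.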

While parse_tree is a gem library for some ruby implementations, it is a standard library in Rubinius.

parse_tree uses the C function rb_compile_string, which is defined in MRI's parse.c.

Note that parse_tree cannot accept code given as a string, only a method or a class. ([Added on July 28]: This was wrong: parse_tree can accept both.)

ruby_parser

Exactly the same as parse_tree, except that it is written in pure Ruby.

require 'rubygems'
require 'ruby_parser'

a = RubyParser.new.parse '1+1'
p a.class #=> Sexp
p a #=> s(:call, s(:lit, 1), :+, s(:arglist, s(:lit, 1)))

or

$ ruby_parse a.rb # Unfortunately ruby_parse command handles only actual files.

ruby_parser has its own yacc file, ruby_parser.y, which is 1789 lines long. It is processed by racc.

Rubinius uses ruby_parser as its ruby parser.

Note that ruby_parser accepts only code given as a string, not a method or a class.

treetop

Treetop is a packrat parser written in pure Ruby. Grammars are written in Treetop's own syntax.
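To show what "packrat" means, here is a hand-written sketch in plain Ruby. It is not Treetop's API or grammar syntax; the class TinyPackrat and its rules are made up for illustration. It parses the grammar expr <- number '+' expr / number, and caches the result of every rule at every input position so that backtracking never repeats work.

```ruby
# Hypothetical hand-written packrat parser (not Treetop) for the PEG
#   expr   <- number '+' expr / number
#   number <- [0-9]+
class TinyPackrat
  def initialize(input)
    @input = input
    @memo = {} # [rule, position] => [node, next_position] or nil
  end

  # Returns a nested array, or nil if the whole input does not match.
  def parse
    node, pos = expr(0)
    node if pos == @input.length
  end

  private

  # The packrat part: each rule is evaluated at most once per position.
  def memoize(rule, pos)
    key = [rule, pos]
    return @memo[key] if @memo.key?(key)
    @memo[key] = yield
  end

  def expr(pos)
    # Ordered choice: try the first alternative, then fall back.
    memoize(:expr, pos) { sum(pos) || number(pos) }
  end

  def sum(pos)
    left, after_left = number(pos)
    return nil unless left && @input[after_left] == '+'
    right, after_right = expr(after_left + 1)
    return nil unless right
    [[:call, left, :+, [:arglist, right]], after_right]
  end

  def number(pos)
    memoize(:number, pos) do
      digits = @input[pos..-1][/\A\d+/]
      digits && [[:lit, digits.to_i], pos + digits.length]
    end
  end
end

p TinyPackrat.new('1+1').parse
#=> [:call, [:lit, 1], :+, [:arglist, [:lit, 1]]]
```

When the first alternative of expr fails, number is asked again at the same position and the answer comes from the cache. The output mimics the S-expression style used earlier in this entry only for familiarity.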

Ripper

A Ruby parser written in C. It is part of the Ruby 1.9 standard library. It uses parse.y, just as Ruby itself does.
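For comparison with the parsers above, here is a small sketch using Ripper.sexp on the same input. It assumes a Ruby version whose ripper library provides Ripper.sexp; the tree is Ripper's own format, which keeps source positions and differs from the parse_tree style.

```ruby
require 'ripper'

# Each leaf such as [:@int, "1", [1, 0]] carries the token text
# and its [line, column] position in the source.
tree = Ripper.sexp('1+1')
p tree.class #=> Array
p tree #=> [:program, [[:binary, [:@int, "1", [1, 0]], :+, [:@int, "1", [1, 2]]]]]
```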

2 comments:

  1. Parse Tree doesn't use racc at all.

    "Note that parse_tree cannot accept a string represented code instead of a method or a class." -- is also false, or I don't understand what it means.

  2. I made mistakes. Both sentences were not correct. Thank you, zenspider! I added a note on this entry.
