Treetop Parser
I’ve been doing some work with Coco/R for Ruby lately. I understand that Coco/R is a classic. I understand that in some languages, it might be great. But the Ruby implementation of it is a half-finished, hacked-together piece of crap.
Seriously. It’s rubbish.
So, anyway, I’ve spent a couple of days at the office trying to get it to parse these fairly complicated strings of text. An example might look something like…
Some Title Name #5 Vol. 01 Subtitle:Subsubtitle (AAA123456) (More Extra Data) TYLER CVR A
That in itself would not be hard… it’s the fact that they’re all horribly different. Is data consistency really that freaking hard?! ...But that’s a whole other digression.
Anyway… Getting to the point. At RubyConf, Nathan Sobo of Pivotal Labs introduced Treetop, a parser generator written in Ruby. It’s built on a different theory than traditional parser generators (parsing expression grammars, as I understand it). I haven’t looked into the gory details much, so I can’t really comment.
What I can comment on, however, is the fact that it works really well. I decided to toy around with it a bit before I switch my project at work over to it. So, I watched the screencast which Nathan put together, and I got to work…
After maybe 30 minutes of hacking on it, I have what I believe to be a pretty decent CSS parser. It lacks some things at the moment, specifically application… But whatever. Here it is, for your amusement:
grammar Css
  rule stylesheet
    whitespace rule_set* whitespace
  end

  rule rule_set
    whitespace selector+ whitespace '{' whitespace instruction* whitespace '}'
  end

  rule selector
    selector_key whitespace
  end

  rule selector_key
    [a-zA-Z#.:]+
  end

  rule instruction
    instruction_key whitespace ':' whitespace instruction_value ';' whitespace
  end

  rule instruction_key
    [a-z-]+
  end

  rule instruction_value
    [a-z]+
  end

  rule whitespace
    [\s]*
  end
end
I’m sure it can be done more elegantly and more solidly… but for a first pass in 30 minutes, I’m pleased. What strikes me most of all is how easy it is. Easy to get set up, easy to write grammar files for, and easy to use.
It took me significantly longer than this just to get Coco/R running…