Pass the parser, please
I've been tinkering with a new search engine. The first trick was to build a parser that turns a recipe into an orderly format. The good side: all I need to parse out is the ingredients. The bad side: funky recipes don't lend themselves to parsing. My site doesn't really store recipes: it just organizes the data so that searches are relevant and the results link to the offsite resource.
When I looked to see if anyone else was doing it, I found a recipe XML format (which is very easy to parse). I had some issues with that format, so I modified it a little.
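To give a feel for what "parse out the ingredients" can mean, here is a minimal Python sketch. It is not the parser behind the site and it does not follow the recipe XML format mentioned above: the unit list, the `parse_ingredient` and `to_xml` names, and the `ingredients`/`ingredient`/`qty`/`unit`/`item` element names are all made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical unit list; a real crawler would need many more entries.
UNITS = {"cup", "cups", "tbsp", "tsp", "g", "kg", "ml", "l", "oz", "lb"}

# Optional leading quantity ("2", "1.5", "1/2" or "1 1/2"), then the rest.
LINE_RE = re.compile(
    r"^\s*(?P<qty>\d+(?:\s+\d+/\d+|/\d+|\.\d+)?)?\s*(?P<rest>.+?)\s*$"
)


def parse_ingredient(line):
    """Split one ingredient line into quantity, unit and item (all strings)."""
    match = LINE_RE.match(line)
    if match is None:  # empty line: nothing to split
        return {"qty": None, "unit": None, "item": line.strip()}
    qty = match.group("qty")
    words = match.group("rest").split()
    unit = None
    # Only treat the first word as a unit if a quantity came before it
    # and something is left over to be the item.
    if qty is not None and len(words) > 1:
        candidate = words[0].lower().rstrip(".")
        if candidate in UNITS:
            unit = candidate
            words.pop(0)
    return {"qty": qty, "unit": unit, "item": " ".join(words)}


def to_xml(lines):
    """Serialize parsed ingredient lines using the made-up element names."""
    root = ET.Element("ingredients")
    for line in lines:
        parsed = parse_ingredient(line)
        node = ET.SubElement(root, "ingredient")
        for key in ("qty", "unit", "item"):
            if parsed[key] is not None:
                ET.SubElement(node, key).text = parsed[key]
    return ET.tostring(root, encoding="unicode")


print(to_xml(["1 1/2 cups flour", "2 eggs", "salt to taste"]))
```

Lines with no leading number, like "salt to taste", come through as an item with no quantity or unit, which is about the best a sketch this small can do with funky recipes.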
Then I looked for a natural language parser. One site, Recipezaar, toiled at this task for two years (http://www.decafbad.com/blog/2003/11/14/the_recipe_web). I got to version 0.1 in two weeks: a parser for foodtv.ca recipes. All of the sites are a little bit different, so I'm going to build a modified parser for each site I hope to crawl.
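One way to organize a parser per site is a small registry keyed by hostname. The Python sketch below is only an illustration under assumptions: `PARSERS`, `site_parser`, `parser_for` and the body of `parse_foodtv` are hypothetical, and the placeholder parser simply treats every non-empty line as an ingredient rather than reflecting how foodtv.ca pages are really laid out.

```python
from urllib.parse import urlparse

# Hypothetical registry: hostname -> function that extracts ingredient lines.
PARSERS = {}


def site_parser(hostname):
    """Decorator that registers a parser function for one site."""
    def register(func):
        PARSERS[hostname] = func
        return func
    return register


@site_parser("foodtv.ca")
def parse_foodtv(page_text):
    # Placeholder logic: assumes one ingredient per non-empty line.
    return [line.strip() for line in page_text.splitlines() if line.strip()]


def parser_for(url):
    """Look up the parser for a URL, ignoring a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return PARSERS.get(host)  # None means the site has no parser yet


parser = parser_for("http://www.foodtv.ca/some/recipe")
if parser is not None:
    print(parser("2 eggs\n\n1 cup milk\n"))
```

Adding a new site then means writing one more decorated function, and a URL from an unknown site just comes back with no parser instead of being fed to the wrong one.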
If you want to add a recipe resource, please feel free: http://dewolfe001.dotgeek.org/EmptyCupboard/addsite.php
If you want to see a recipe of mine and its recipe XML counterpart, be my guest.