FastNews.TV

New York Post

Friday, June 3, 2011

 

Zanran is an interesting search start-up that both indexes and maps numerical content, finding 'semi-structured' data (as they put it) on the web. Such data could be a graph in a PDF report, a table in an Excel spreadsheet, or a bar chart shown as an image on an HTML page.

Currently, the search engine extracts tables and images from HTML, PDF and Excel files. In the near future it will also process PowerPoint and Word documents. The system examines millions of images and determines whether each one is a graph, chart or table, and whether it has numerical content.

View full post on Search Engine Journal
