Nutch is an effort to build an open-source search engine. It uses Lucene for its search and indexing component. The fetcher (robot) was written from scratch solely for this project.
Nutch has a highly modular architecture that allows developers to create plug-ins for the following activities: media-type parsing, data retrieval, querying, and clustering.
Tim O'Reilly has a seat on Nutch's board of directors.
Doug Cutting is the lead developer.
As of 2003, it is written entirely in Java, but its data is written in language-independent formats. In June 2003, a 100-million-page demonstration system was run successfully.