YaCy is a freedistributed search engine, built on principles of peer-to-peer networks. Its core is a computer program written in Java distributed on several hundred computers,, so-called YaCy-peers. Each YaCy-peer independently crawls through the Internet, analyzes and indexes found web pages, and stores indexing results in a common database which is shared with other YaCy-peers using principles of P2P networks. It is a free search engine that everyone can use to build a search portal for their intranet and to help search the public internet clearly. Compared to semi-distributed search engines, the YaCy-network has a decentralised architecture. All YaCy-peers are equal and no central server exists. It can be run either in a crawling mode or as a local proxy server, indexing web pages visited by the person running YaCy on his or her computer.. Access to the search functions is made by a locally running web server which provides a search box to enter search terms, and returns search results in a similar format to other popular search engines. YaCy was created in 2003 by Michael Christen.
System components
YaCy search engine is based on four elements: ;Crawler: A search robot that traverses from web page to web page and analyzes their content. ;Indexer: Creates a reverse word index i.e. each word from the RWI has its list of relevant URLs and ranking information. Words are saved in form of word hashes. ;Search and administration interface: Made as a web interface provided by a local HTTP servlet with servlet engine. ;Data storage: Used to store the reverse word index database utilizing a distributed hash table.
YaCy harvests web pages with a web crawler. Documents are then parsed, indexed and the search index is stored locally. If your peer is part a peer network, then your local search index is also merged into the shared index for that network.
A search is started then the local index contributes together with a global search index from peers in the YaCy search network.
YaCy platform architecture
YaCy uses a combination of techniques for the networking, administration, and maintenance of indexing the search engine including blacklisting, moderation, and communication with the community. Here is how YaCy performs these operations:
YaCy is available on Windows, Mac and Linux. The Debian package can be installed from a repository available at the subdomain of the project's website. The package is not maintained in the official Debianpackage repository yet.