Yandex Search
Yandex Search is a web search engine owned by the Russian corporation Yandex and is the company's core product. In January 2015, Yandex Search generated 51.2% of all search traffic in Russia.
About
The search technology provides local search results in more than 1,400 cities. Yandex Search also features “parallel” search, which presents results from both the main web index and specialized information resources, including news, shopping, blogs, images and videos, on a single page. Yandex Search is responsive to real-time queries, recognizing when a query requires the most current information, such as breaking news or the most recent post on Twitter on a particular topic. It also offers additional features, among them: Wizard Answer, which provides additional information; a spell checker; autocomplete, which suggests queries as the user types; and an antivirus that detects malware on web pages.
In May 2010, Yandex launched Yandex.com, a platform for beta testing and improving non-Russian language search.
The search product can be accessed from personal computers, mobile phones, tablets and other digital devices. In addition to web search, Yandex provides a wide range of specialized search services.
In 2009, Yandex launched MatrixNet, a new machine learning method that significantly improves the relevance of search results. It allows the Yandex search engine to take a very large number of factors into account when deciding how relevant search results are.
Another technology, Spectrum, was launched in 2010. It allows inferring implicit queries and returning matching search results. The system automatically analyses users' searches and identifies objects like personal names, films or cars. Proportions of the search results responding to different user intents are based on the user demand for these results.
Since its first release on July 21, 2017, the Brave web browser has featured Yandex as one of its default search engines.
Functionality
Basic Information
The search engine consists of three main components:
- Agent. This is a search robot that crawls the network, downloading and analyzing documents. If a new link is found during site analysis, it is added to the robot's list of web addresses. Search robots are of the following types: spiders, which download sites the way a user's browser does; crawlers, which discover new, still unknown links by analyzing already known documents; and indexers, which analyze the detected web pages and add data to the index. Many downloaded documents are divided into disjoint parts and stripped of markup.
- Index. This is a database compiled by the search engine's indexing robots. Documents are searched for in the index.
- Search engine. After the load on the search system is analyzed, the user's search request is sent to the least loaded server. To make this possible, Yandex servers are clustered. The request is then processed by a program called "Metapoisk" (metasearch). Metapoisk analyzes the request in real time: it determines the user's geographic location, conducts linguistic analysis, and so on. The program also determines whether the request is among the most popular or was asked recently. Results for such requests are stored for some time in the metasearch cache, and on a match the previously saved results are displayed. If the request is rare and there is no match in the cache, the system redirects it to the Basic Search program, which analyzes the system index, itself split across different replicated servers. The information obtained then returns to the metasearch, where the data is ranked and shown to the user in final form.
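The request flow described above (check the metasearch cache first, fall back to the basic search on a miss) can be illustrated with a minimal Python sketch. The class and function names, the cache lifetime and the toy word-to-documents index are hypothetical and are not taken from Yandex's implementation; load balancing, linguistic analysis and ranking are left out.

```python
import time


class MetaSearch:
    """Toy front end: serves cached results for repeated queries, else asks basic search."""

    def __init__(self, basic_search, ttl_seconds=60.0, clock=time.monotonic):
        self.basic_search = basic_search  # callable: query -> list of document ids
        self.ttl_seconds = ttl_seconds    # how long cached results stay valid (assumed value)
        self.clock = clock
        self.cache = {}                   # normalized query -> (timestamp, results)

    def search(self, query):
        key = " ".join(query.lower().split())  # crude normalization, for illustration only
        now = self.clock()
        cached = self.cache.get(key)
        if cached is not None and now - cached[0] < self.ttl_seconds:
            return cached[1]                   # cache hit: reuse previously saved results
        results = self.basic_search(key)       # cache miss: consult the index
        self.cache[key] = (now, results)
        return results


def make_basic_search(index):
    """Return a basic search over a dict mapping word -> list of document ids."""
    calls = []

    def basic_search(query):
        calls.append(query)  # recorded so that cache hits can be observed
        hits = [set(index.get(word, [])) for word in query.split()]
        return sorted(set.intersection(*hits)) if hits else []

    basic_search.calls = calls
    return basic_search
```

With this sketch, two requests that differ only in case or spacing reach the basic search once; the second is answered from the cache.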
Indexing
The search engine is also able to index text inside Shockwave Flash objects if these elements are served as a separate page with the MIME type application/x-shockwave-flash, as well as files with the .swf extension.
Yandex has two scanning robots, the “main” and the “fast”. The first is responsible for the whole Internet; the second indexes sites whose information changes and is updated frequently. In 2010, the “fast” robot received a new technology called “Orange”, developed jointly by the California and Moscow divisions of Yandex.
Since 2009, Yandex has supported Sitemaps technology.
Server logs
In the server logs, Yandex robots are represented as follows:
- Mozilla/5.0 - the main indexing robot.
- Mozilla/5.0 - a robot that detects site mirrors. If several sites have the same content, only one is shown in the search results.
- Mozilla/5.0 - the Yandex.Images indexer.
- Mozilla/5.0 - the Yandex.Video indexer.
- Mozilla/5.0 - the multimedia data indexer.
- Mozilla/5.0 - a search robot that indexes post comments.
- Mozilla/5.0 - a search robot that indexes pages submitted through the "Add URL" form.
- Mozilla/5.0 - the Yandex.Direct checker.
- Mozilla/5.0 - the Yandex.Metrica indexer.
- Mozilla/5.0 - the Yandex.Catalog checker.
- Mozilla/5.0 - the Yandex.News indexer.
- Mozilla/5.0 - the Yandex anti-virus robot.
Query language
https://yandex.com/support/mail/web/letter/query-language.html
- "" - exact quote
- | - placed between words to find either of them
- * - placed between words to stand for a missing word
- site: - search on a specific site
- date: - search for documents by date, for example, date: 2007
- + - placed before a word that must appear in the document
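As an illustration, the operators listed above can be combined into query strings programmatically. The following Python sketch only assembles text; the helper names are hypothetical, and the spacing around the operators is an assumption rather than documented syntax.

```python
def exact(phrase):
    """Wrap a phrase in double quotes for an exact-quote search."""
    return f'"{phrase}"'


def any_of(*words):
    """Join words with | to find documents containing one of them."""
    return " | ".join(words)


def required(word):
    """Prefix a word with + so that it must appear in the document."""
    return f"+{word}"


def on_site(query, host):
    """Restrict a query to a specific site with the site: operator."""
    return f"{query} site:{host}"
```

For example, on_site(exact("golden city"), "example.com") builds a query that looks for the exact phrase on a single (placeholder) site.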
Search results
Along with the original “exact form” of the query, Yandex automatically searches for its variations and reformulations. Yandex search takes the morphology of the Russian language into account, so regardless of the form of a word in the query, the search is performed for all of its word forms. If morphological analysis is undesirable, an exclamation mark can be placed before the word, in which case the search returns only that specific form. In addition, the search largely ignores so-called stop words (prepositions, pronouns, punctuation and the like) because they are so widespread.
As a rule, abbreviations are expanded and spelling is corrected automatically. The search also looks for synonyms. How far the original query is expanded depends on the context. Expansion does not occur when the query consists of highly specialized terms or proper names of companies, when the word “price” is added, or when the query is in exact quotes.
Search results for each user are formed individually based on their location, language of a query, interests and preferences based on the results of previous and current search sessions. However, the key factor in ranking search results is their relevance to the search query. Relevance is determined based on a ranking formula, which is constantly updated based on machine learning algorithms.
The search is performed in Russian, English, French, German, Ukrainian, Belarusian, Tatar and Kazakh.
Search results can be sorted by relevance and by date.
The search results page consists of 10 links with short annotations called “snippets”. A snippet includes a text comment, a link, an address, popular sections of the site, pages on social networks, and so on. As an alternative to snippets, Yandex introduced a new interface called “Islands” in 2014.
Yandex implements a “parallel search” mechanism: together with the web search, a search is performed on Yandex services such as Catalog, News, Market, Encyclopedias and Images. As a result, in response to a user's request the system shows not only textual information but also links to video files, pictures, dictionary entries, and so on.
A distinctive feature of the search engine is the technology of “intent search”, meaning a search aimed at solving a problem. Elements of intent search include dialog prompts for ambiguous requests, automatic text translation, and information about the characteristics of a requested car. For example, for the request “Boris Grebenshchikov - Golden City” the system shows a form for listening to the music online from the Yandex Music service, and for the request “st. Koroleva 12” it shows a fragment of the map with the object marked on it.
Spam and Virus Protection
In 2013, Yandex was considered by some to be the safest search engine at the time and the third most secure among all web resources. By 2016, Yandex had slipped to third place, with Google being first.
Checking of web pages and user warnings appeared on Yandex in 2009: since then, a dangerous site in the search results has been accompanied by the note “This site may threaten the security of your computer”. Two technologies are used together to detect threats. The first, purchased from the antivirus vendor Sophos, is based on a signature approach: when accessing a web page, the antivirus system consults a database of already known viruses and malware. This approach is fast but practically powerless against new viruses that have not yet entered the database. Therefore, alongside the signature approach, Yandex uses its own antivirus complex based on behavioral analysis. When accessing a site, the Yandex program checks whether the site requested additional files from the browser, redirected it to an extraneous resource, and so on. If the site turns out to perform certain actions without the user's permission, it is placed on the “black list” and in the database of virus signatures. Information about the infection appears in the search results, and the site owner is notified through the Yandex.Webmaster service. After the first check Yandex runs a second one, and if the infection is confirmed, checks become more frequent until the threat is eliminated. The share of infected sites in the Yandex database does not exceed 1%.
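The combination of a signature lookup and a behavioral check can be sketched in Python as follows. This is a simplified illustration, not Yandex's or Sophos's actual detection logic: the SHA-256 "signature", the action names and the function names are all assumptions made for the example.

```python
import hashlib

# Hypothetical labels for behavior observed while loading a page.
SUSPICIOUS_ACTIONS = {"requested_extra_files", "redirected_to_external_site"}


def signature_of(content: bytes) -> str:
    """Use a content hash as a stand-in for a malware signature."""
    return hashlib.sha256(content).hexdigest()


def check_page(content: bytes, observed_actions, known_bad: set) -> str:
    """Classify a page using the signature database first, then its behavior."""
    sig = signature_of(content)
    if sig in known_bad:
        return "infected (signature)"      # fast path: already known threat
    if SUSPICIOUS_ACTIONS & set(observed_actions):
        known_bad.add(sig)                 # new threat joins the signature database
        return "infected (behavior)"
    return "clean"
```

In this sketch a page caught by the behavioral check is added to the signature set, so the next visit is flagged by the fast signature path.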
In 2013, Yandex checked 23 million web pages and showed users 8 million warnings every day. Approximately one billion sites were checked monthly.
Search Ranking
For a long time, the key ranking factor for Yandex was the number of third-party links to a particular site. Each page on the Internet was assigned its own citation index, similar to the index for authors of scientific articles: the more links, the better. A similar mechanism was implemented in both Yandex and Google's PageRank. To prevent manipulation, Yandex uses multivariate analysis in which only 70 of the 800 factors are affected by the number of third-party links. Today, a much greater role is played by the content of the site and the presence or absence of keywords in it, the readability of the text, the domain name and its history, and the presence of multimedia content.
On December 5, 2013, Yandex announced that in the future it would stop taking the link factor into account altogether.
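The idea of counting third-party links can be shown with a naive Python sketch. It is not Yandex's citation index or Google's PageRank: it simply counts, for each site, how many distinct other hosts link to it, and the function name and input format are assumptions made for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def citation_counts(links):
    """Count, for each target host, the distinct other hosts linking to it.

    `links` is an iterable of (source_url, target_url) pairs.
    """
    sources = defaultdict(set)
    for source_url, target_url in links:
        source_host = urlsplit(source_url).hostname
        target_host = urlsplit(target_url).hostname
        # Links within the same host are not third-party links and are skipped.
        if source_host and target_host and source_host != target_host:
            sources[target_host].add(source_host)
    return {host: len(linking) for host, linking in sources.items()}
```

Counting distinct linking hosts rather than raw links is one simple way to blunt repeated links from a single site, which hints at why a raw link count is easy to inflate.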
Search hints
As the user types a query in the search bar, the search engine offers hints in a drop-down list. Hints appear even before the search results do and allow the user to refine the query, correct the keyboard layout or a typo, or go directly to the desired site. Hints are generated for each user individually, drawing in part on the history of that user's search queries. In 2012, so-called “Smart Search Hints” appeared, which instantly provide information about basic constants and traffic jams and include a built-in calculator. In addition, the hints integrate a translator, the schedule and results of football matches, exchange rates, weather forecasts and more. The exact time can be found by asking “what time is it”. In 2011, hints in Yandex search became fully localized for 83 regions of Russia.
In addition to the search itself, hints are built into Yandex.Dictionaries, Yandex.Market, Yandex.Maps and other Yandex services.
The hint function grew out of the development of intent search technology. It first appeared in Yandex.Bar in August 2007 and was introduced on the main page of the search engine in October 2008. The function is available in both the desktop and mobile versions of the site, and Yandex shows its users more than a billion search hints per day.
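A drop-down of hints can be approximated by prefix matching against a log of past queries, as in the minimal Python sketch below. The function names and the ranking rule (most frequent first, then alphabetical) are assumptions for illustration; they do not describe how Yandex actually generates or personalizes hints.

```python
from collections import Counter


def build_hints(query_log):
    """Count how often each normalized query occurs in a log of past queries."""
    return Counter(q.strip().lower() for q in query_log if q.strip())


def suggest(prefix, hints, limit=3):
    """Return up to `limit` past queries starting with `prefix`, most frequent first."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [(q, n) for q, n in hints.items() if q.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))  # by frequency, then alphabetically
    return [q for q, _ in matches[:limit]]
```

Passing a log built from a single user's history instead of a global one would be one simple way to make such hints depend on that user's earlier queries.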
History
For a long time, changes to the search engine were not widely publicized and remained nameless. Only at the beginning of 2008, when the launch of the 8 SP1 algorithm was announced, did Yandex state that new ranking algorithms would henceforth bear the names of cities.
1990s
The name of the system, Yandex, was invented jointly by Arkady Volozh and Ilya Segalovich. The word stands for “yet another indexer”. According to Artemy Lebedev's interpretation, the name of the search engine is consonant with Yandeks, where yang means the masculine principle.
The yandex.ru search engine was announced by CompTek on September 23, 1997 at the Softool exhibition, although some developments in the field of search were carried out by the company even earlier.
The first index contained information on 5 thousand servers and occupied 4.5 GB.
Also in 1997, Yandex search began to be used in the Russian version of Internet Explorer 4.0. It became possible to enter queries in natural language.
In 1998, the function “find similar documents” appeared for each search result.
As of 1998, Yandex.Search ran on three machines running FreeBSD with Apache: one machine crawled the Internet and indexed documents, one served as the search engine, and one duplicated the search engine.
In 1999, search within categories appeared, combining a search engine with a catalog. The search engine version was also updated.
2000
On June 6, 2000, the second version of the search engine was presented. A parallel search mechanism was introduced, and information from major sources was offered alongside the search results. Users could limit the search results to a selected topic. The “Popular finds” heading appeared, offering words that refine the search.
In December 2000, the volume of indexed information reached 355.22 GB.
2001
In 2001, Yandex overtook another Russian search engine, Rambler, in traffic and became the leading search engine of the Runet. Yandex began to understand natural-language requests phrased as questions. The system learned to recognize typos and suggest corrections. The design changed.
2002
The number of daily queries to the Yandex search engine exceeded 2 million.
2003
Indexing of .rtf and .pdf documents was launched. Search results also became available in XML format.
2004
The ranking algorithm changed. Yandex began indexing documents in .swf, .xls and .ppt formats.
At the end of the year, a study was published that revealed certain details of ranking in the search engine.
2005
In summer, the so-called “fast” search robot was launched, working in parallel with the main one and intended for indexing frequently updated pages. The base of the “fast robot” is updated every 1.5–2 hours.
The ranking algorithm was improved to increase search accuracy.
Search capabilities were expanded with the help of Yandex.Dictionaries and Yandex.Lingvo. The search engine learned to understand queries like “What is in Spanish” and translate them automatically.
It became possible to limit search results by region.
2006
Since May 2006, site icons have been displayed in the search results.
In early December, a “Saved copy” item appeared next to each link in the search results; clicking it takes the user to a full copy of the page in a special archive database.
2007
The ranking algorithm changed again.
2008
In 2008, Yandex began for the first time to openly announce changes to the search algorithm and started naming them after Russian cities. The name of the “city” of each subsequent algorithm begins with the letter with which the previous name ended.
Yandex Achievements
A case in which local search companies are not inferior to American brands is almost unique in the world, leaving aside the experience of China, where Google was blocked in 2010, mailboxes of human rights defenders are hacked, and local providers often redirect the address to Baidu. Russia is also the only country in the world, apart from the United States, that has created more than one successful search technology with a significant market share without protectionist measures.
According to media expert Mikhail Gurevich, Yandex is a “national treasure” and a “strategic product”.
This was also recognized in the State Duma of the Russian Federation, where a bill appeared in May 2012 recognizing Yandex and VKontakte as strategic enterprises in their capacity as national transmitters of information. In 2009, President of Russia Dmitry Medvedev initiated Sberbank's purchase of a “golden share” in Yandex in order to keep a company of nationwide importance from falling into foreign hands.
In 2012, Yandex overtook Channel One in daily audience, making Yandex a leader in the domestic media market. In 2013, Yandex confirmed this status by overtaking Channel One in revenue.
In 2008, Yandex was the ninth search engine in the world, in 2009 the seventh, and in 2013 the fourth.
One factor behind this position is the presence in Russia of a sufficient number of mathematically savvy specialists with a scientific instinct.
By 2002, the word Yandex had become so common that when Arkady Volozh's company demanded the return of the yandex.com domain, which had been bought by third parties, the defendant argued that the word “Yandex” was already synonymous with search and had become a household word in Russia.
Since the fall of 2012, the Yandex search engine has had more users than Google on the Google Chrome browser in Russia.
Logo
The Yandex logo appears in numerous settings to identify the search engine company. Yandex has relied on several logos since its renaming. The first logo was created by Arkady Volozh and debuted in 1997 on the Яndex.Site and Яndex.CD products, even before the announcement of the Yandex search engine. It was modeled on the CompTek logo.
Since 1997, the logos have been designed by Art. Lebedev Studio, which has produced four versions. The current logo uses Cyrillic lettering.