
Comparison of Efficiency of Some Search Engines on The Internet

Haitham Abbas Khalaf

Introduction

         A search engine is a website that provides information retrieval on the Internet. Most search engines look for information on World Wide Web sites, but there are also systems that can search files on servers, goods sold online, and information in Usenet newsgroups. Recently a new type of search engine has appeared, based on RSS technology and on other data formats. The software that provides a search engine's functionality is itself called a search engine (in Russian, poiskovyi dvizhok or poiskovaya mashina). Most search engines use relevance-based retrieval methods that draw on a comprehensive model of the morphology of the language. Information is indexed by special search robots. The main problems of search engines are those discussed under the heading of the "Deep Web". Improving search engines is one of the priorities of today's Internet.
         The first search engine for the World Wide Web was "Wandex", a now-defunct index created by the "World Wide Web Wanderer", a bot developed by Matthew Gray of the Massachusetts Institute of Technology in 1993. Also in 1993 the search engine "Aliweb" appeared; it is still in operation. The first full-text, crawler-based search engine (that is, one that indexes resources by means of a robot) was "WebCrawler", launched in 1994. Unlike its predecessors, it let users search for any keyword on any web page, which has since become standard in all major search engines. It was also the first search engine to become widely known. "Lycos", developed at Carnegie Mellon University, was launched in 1994 as well. Soon a host of competing search engines appeared, such as "Excite", "Infoseek", "Inktomi", "Northern Light" and "AltaVista". In some ways they competed with popular Internet directories such as "Yahoo!". Later the directories merged with search engines or added search engines of their own to increase functionality. In 1996 Russian-speaking Internet users gained access to a morphological extension for the AltaVista search engine and to the original Russian search engines Rambler and Aport.
         The Yandex search engine was opened on September 23, 1997. In addition to search engines for the World Wide Web, there were search tools for other protocols, such as Archie for anonymous FTP and "Veronica" for Gopher.

1 Major search engines

         There are many different search engines. The chief among them are:
         1) Google (NASDAQ: GOOG, LSE: GGEA) is the common name of the American company Google Inc., of its website www.google.com, and of the search engine located at that site. "Google" is a distorted spelling of the word "googol", coined by Milton Sirotta, nephew of the American mathematician Edward Kasner, to denote the number written as a one followed by a hundred zeros. The word has also been glossed as "to goggle" (to stare wide-eyed), an allusion to Google's search activity. The company is incorporated as Google Inc. and is based in Mountain View, California. Its investors include Kleiner Perkins Caufield & Byers and Sequoia Capital. For its technological innovation Google has received a host of awards, including the "People's Voice" award for outstanding technical achievement and the award for best search engine on the Internet from Yahoo! Internet Life. Google has also won PC Magazine's "Technical Excellence" prize and the "Best search engine" award from the magazine The Net. Many companies, including AOL (Netscape) and the Washington Post, use Google search technology on their websites.
         On August 19, 2004 Google began selling its shares on the stock market (IPO), becoming a public company traded as NASDAQ: GOOG. Almost 20 million shares were sold, worth 1.67 billion dollars, of which Google itself received only $1.2 billion. Google retains more than 250 million shares, which it is free to dispose of as it chooses. Google issued shares of two types: ordinary (Class A, 33.6 million in total), which are traded on NASDAQ, and preferred (Class B, 237.6 million), whose circulation is restricted to within the company on terms that are confidential. On August 30, 2004 trading in Google options began on specialized exchanges. In 2007 the most popular site on the Internet turned nine years old: in 1998 Google opened the doors of its office in Menlo Park, California.
         Google, the leader among Internet search engines, holds more than 70% of the world market. It registers some 50 million search queries daily and indexes more than 8 billion pages. Google can find information in 101 languages. At the end of August 2004 Google ran on 132 thousand machines located in different parts of the world. Google supports query operators that restrict a search by site, file type and so on. For example, the query "intitle:Google site:wikipedia.org" returns all Wikipedia articles, in all languages, whose title contains the word "Google" [1]. A full description of the Google query language, with examples of queries, is available online in Russian.
         2) Yandex. The word "Yandex" (formed from the Cyrillic letter "Я" and the word "index", playing on the fact that the Russian pronoun "я" corresponds to the English "I") was coined by Ilya Segalovich, one of the founders of Yandex, who currently holds the post of technical director. The resemblance of the name "Yandex" to that of the first search engine, "Wandex", is pure coincidence. Yandex search lets you look for documents on the Runet in Russian, Ukrainian, Belarusian, English, French and German, taking into account the morphology of Russian and English words and the proximity of words within a sentence. Since the beginning of 2006 Yandex search has also been installed on Mail.ru. In addition to web pages in HTML format, Yandex indexes documents in PDF (Adobe Acrobat), RTF (Rich Text Format), DOC (Microsoft Word), XLS (Microsoft Excel), PPT (Microsoft PowerPoint), SWF (Macromedia Flash) and RSS (blogs and forums) formats. A distinctive feature of Yandex is the ability to fine-tune a search, which is achieved through a flexible query language. By default Yandex shows 10 results per page; in the search settings [1] the page size can be increased to 20, 30 or 50 entries.
         Sometimes the order of sites on these pages may differ, because the results are not all updated at the same time. If a query returns many results, the results page suggests limiting the search by region (that is, by IP range) or by date. If a word or words are not found, Yandex suggests similar ones (because the suggestions depend on how frequent the related words are, this sometimes produces amusing results). It also offers the word as it would read if typed in a different keyboard layout. From time to time the Yandex algorithms responsible for the relevance of results change, which alters the results of searches. The most recent changes were officially announced in March 2004, in April 2005 and in a subsequent January; according to unofficial data, there have been many more. In particular, these changes are aimed at search spam, which leads to irrelevant results for some queries. Search spam that is not filtered out automatically is handled by semi-automatic and manual moderation of the results (with the help of a database of "white-hat optimizers"), as well as by refusing to index "malicious" sites.
         3) Yahoo! (NASDAQ: YHOO) is an American company that offers a number of Internet services, among them the Yahoo! Directory and the popular web mail service Yahoo! Mail, one of the oldest and most popular on the Internet. Not long ago it launched a new version of the Mail interface, based on AJAX. Yahoo! was founded by Stanford University graduate students David Filo and Jerry Yang in January 1994 and incorporated on March 2, 1995. Its headquarters are in Sunnyvale, California, USA. According to Alexa Internet and Netcraft, Yahoo! is today the most visited site on the Internet. Yahoo! handled 3.4 billion web requests per day as of October 2005.
         Early history (1994-1996) and the origin of the name. In January 1994 Stanford University graduate students David Filo and Jerry Yang created a website called "Jerry's Guide to the World Wide Web". The guide was a catalogue of other sites. In April 1994 the site was renamed Yahoo!. There are two versions of the origin of the name. According to the first, the word was taken from Jonathan Swift's book "Gulliver's Travels" and means "rude", "uncouth". According to the second, Yahoo! is an acronym for the phrase "Yet Another Hierarchical Officious Oracle". The site's URL was then http://akebono.stanford.edu/yahoo. By that time Yahoo was already a registered trademark for a barbecue sauce, so an exclamation mark was added to the name. Yang and Filo quickly recognized the commercial potential of the project and on March 2, 1995 founded the corporation Yahoo!.
         In the late 1990s the largest search portals, such as MSN, Lycos, Excite and Yahoo!, grew very rapidly. To keep users on these portals longer, many new services were introduced. On March 8, 1997 Yahoo! acquired the RocketMail service, one of the first free e-mail services; this is how Yahoo! Mail came into being. In addition, Yahoo! acquired ClassicGames.com, which became the basis for Yahoo! Games, and eGroups, which later became Yahoo! Groups. Finally, on July 21, 1999 Yahoo! launched the instant messaging service Yahoo! Messenger. On February 7, 2000 Yahoo.com suffered a DDoS attack and was out of service for several hours. Among the other significant events of the dot-com boom was the announced merger of Yahoo! and eBay. Although the deal did not take place, the companies agreed on a marketing alliance six years later, in 2006.
         Yahoo! was one of the few major Internet companies to survive the dot-com crash. Having come through the crisis (on September 26, 2001 Yahoo! shares reached a historic low of $8.11), Yahoo! entered the telecommunications market. On June 3, 2002 Yahoo! and SBC launched a national dial-up service on the American market, and on August 23, 2005 Yahoo!, together with Verizon, launched a nationwide DSL service. In late 2002 Yahoo! began acquiring other search companies: Inktomi and, in 2003, Overture Services, Inc., together with AltaVista and All The Web. On February 18, 2004 Yahoo! stopped using Google search technology and switched to its own. In 2005-2006 Yahoo! launched the Yahoo! Music and Yahoo! 360° services and acquired Flickr and a number of social services: blo.gs, Upcoming.org, del.icio.us and Webjay.
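
         The operator-restricted Google query quoted earlier in this section can be assembled programmatically. The following is only a minimal sketch: the helper names build_query and search_url are hypothetical, and the search address with its q parameter is an assumption made for illustration, not something taken from the sources of this article.

```python
from urllib.parse import urlencode


def build_query(terms="", **operators):
    """Join operator:value restrictions with optional free-text terms."""
    parts = [f"{name}:{value}" for name, value in operators.items()]
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Percent-encode the query and attach it as the q parameter (assumed)."""
    return f"{base}?{urlencode({'q': query})}"


query = build_query(intitle="Google", site="wikipedia.org")
print(query)              # intitle:Google site:wikipedia.org
print(search_url(query))  # https://www.google.com/search?q=intitle%3AGoogle+site%3Awikipedia.org
```

         Keeping the operators as keyword arguments makes it easy to add further restrictions, such as a file type, without changing the helper.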

        2 How the Aport search engine is designed

         For any user, a search engine consists of two components: the search page and the search results page. The latter is the more important, because it demonstrates the real capability of the system. Here is an example of an Aport results page, with brief comments on its numbered elements:
         1. Tabs for switching between different types of search;
         2. Link to the results of a search across news resources for the given query (the number of news items found is shown in parentheses);
         3. Link to the news item most relevant to the given query;
         4. Result counts for the query;
         5. Title of, and link to, the site found;
         6. Site description written by an editor (imported from the Aport catalogue);
         7. Title and address of the most relevant page found on the site;
         8. Quotations from the full text of the page that match the query;
         9. Link to the saved copy of the text (useful if the site is not accessible over the Internet);
         10. Address of the site found;
         11. Catalogue categories matching the query;
         12. Link to all results on the site (all pages found);
         13. Country or region of Russia in which the site found is located. Clicking the link runs a search restricted to sites from that region;
         14. Link to the Aport catalogue category in which the site found is listed (if the site is published in the catalogue);
         15. Country or region of Russia to which your IP address belongs. Clicking the link runs a search restricted to sites from that region;
         16. Sponsored links matching the query (contextual advertising).

         The first thing we see in the Aport results, apart from the number of documents, is the number of sites found. This is not a mere formality: the entire results list is divided by site rather than by document. The results are arranged so that general information about a site is combined with details about its individual pages. Many search engines today rely, in one way or another, on the concept of a site, but by this they mean simply a server name of the form www.server.com. The site is obtained by cutting the tail off a page address: from http://www.server.com/users/~vasya the site www.server.com is produced. For large servers that host the sites of many companies and people, this is the wrong decision. Aport takes the server name as the site only as a last resort. In general, to determine which group of pages forms a logical whole (a site), Aport uses information from the database of its site catalogue, which is entered by people and is therefore much better than the output of any automatic algorithm (special algorithms are used as well, but only if the site is not listed in the catalogue).
         Aport gives a very informative presentation of the pages found. In each site block, Aport provides information on the most relevant page found on the site (7): its address, title, date and quotations from the document (8). It is important that the quotations are selected from the full text of the document and contain the query words. A link to the saved full text (9) is also given. It is needed if the document is not available at the site (the server is down, the document has been deleted, and so on). If you want information on all the other pages that Aport found on the site, follow the link (12) that closes the results block. This link opens an additional window that displays search results for that site only; they consist of blocks of data on individual pages.
         The ranking of search results is crucial to the quality of search.
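
         Returning to the grouping of pages into sites described above, the difference between the naive tail-cutting rule and a catalogue-based rule can be sketched as follows. This is an illustration under stated assumptions, not Aport's implementation: the catalogue contents, the prefix-matching rule and the function names are all hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical catalogue: address prefixes that human editors registered as sites.
CATALOGUE_SITES = [
    "www.server.com/users/~vasya",
    "www.server.com/users/~petya",
]


def site_by_host(url):
    """Naive rule: cut the tail off the address and keep only the server name."""
    return urlsplit(url).netloc


def site_by_catalogue(url, catalogue=CATALOGUE_SITES):
    """Prefer the longest catalogue prefix; use the server name as a last resort."""
    parts = urlsplit(url)
    address = parts.netloc + parts.path
    matches = [
        prefix for prefix in catalogue
        if address == prefix or address.startswith(prefix + "/")
    ]
    if matches:
        return max(matches, key=len)
    return parts.netloc


page = "http://www.server.com/users/~vasya/photos.html"
print(site_by_host(page))       # www.server.com
print(site_by_catalogue(page))  # www.server.com/users/~vasya
```

         With the naive rule every user page on a large shared server collapses into one "site"; with the catalogue lookup each registered user area is treated as a separate logical whole.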
         Developing a good ranking function is far from an easy task, in particular because of the heterogeneity of the documents being ranked and attempts to distort the results deliberately through search spam. A powerful tool for improving ranking quality is the hypertext structure of the Internet: link-based ranking and a citation index make it possible (although not always) to distinguish quality content from similar-looking junk and, which is especially important for site owners, originals from copies. But here the same problems arise: the uneven link structure and deliberate distortion by spammers. Another important way of increasing relevance is the use of information from the Aport catalogue, which is highly reliable because it is compiled or checked by professionally trained editors. A guiding principle of ranking in Aport is the attempt to take into account the maximum number of ranking criteria in combination. In particular, a marked advantage is given to documents that score highly on several independent criteria (for example, the frequency of search terms in the text and link-based ranking). Ranking is performed by purely automatic means; no ad hoc adjustments are made to the results for any query or site.
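
         The principle of favouring documents that score well on several independent criteria can be illustrated with a small sketch. The actual ranking formula is not given in this article; the geometric mean used here, the criterion names and the sample scores are assumptions chosen only to show the effect.

```python
from math import prod


def combined_score(criteria):
    """Geometric mean of criterion scores, each assumed to lie in [0, 1].

    A document that is strong on one criterion but weak on another is
    penalised more heavily than it would be by a plain average.
    """
    values = list(criteria.values())
    return prod(values) ** (1.0 / len(values))


# Hypothetical documents with made-up scores.
documents = {
    "balanced": {"term_frequency": 0.7, "link_rank": 0.7},
    "text_only": {"term_frequency": 1.0, "link_rank": 0.1},
    "links_only": {"term_frequency": 0.05, "link_rank": 1.0},
}

ranking = sorted(
    documents,
    key=lambda name: combined_score(documents[name]),
    reverse=True,
)
print(ranking)  # ['balanced', 'text_only', 'links_only']
```

         A plain average would rank all three documents almost equally; the geometric mean puts the document that does well on both criteria clearly first.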

Conclusion

        Many search engines are based on well-known methods and algorithms developed in the pre-Internet era. The overall objective of search on the Internet is to find documents that meet the user's information needs. Ten years ago, finding the necessary information did not pose a problem for the Internet user. However, over time the situation has changed, and now users do not always know how to formulate a search query. Consequently, the search task needs to be rethought and new methods developed for narrowing a search. One method that helps the user search the Internet is the clustering of documents. Clustering systems for English were implemented by Western specialists several years ago. The same approach also works for searching documents in Russian. The advantage of the intelligent search engine Nigma.ru is that its algorithm clusters documents taking the features of the Russian language into account, combines results from different search engines, uses user feedback to improve clustering quality, uses counters to sort the search results, and corrects errors.
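
        The idea of clustering search results can be shown with a minimal sketch. This is not the Nigma.ru algorithm, which is not described here; it is a simple greedy grouping by word overlap (Jaccard similarity), with a hypothetical threshold and made-up snippets, and it ignores morphology entirely.

```python
import re


def tokens(text):
    """Lower-cased set of words; \\w also matches Cyrillic letters."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Share of common words among all words of the two sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(snippets, threshold=0.25):
    """Greedy one-pass clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"terms": set of words, "items": list of snippets}
    for snippet in snippets:
        words = tokens(snippet)
        best, best_similarity = None, threshold
        for candidate in clusters:
            similarity = jaccard(words, candidate["terms"])
            if similarity >= best_similarity:
                best, best_similarity = candidate, similarity
        if best is None:
            clusters.append({"terms": set(words), "items": [snippet]})
        else:
            best["terms"] |= words
            best["items"].append(snippet)
    return [c["items"] for c in clusters]


results = [
    "jaguar car dealer prices",
    "jaguar car models and prices",
    "jaguar cat habitat in the rainforest",
    "rainforest habitat of the jaguar cat",
]
for group in cluster(results):
    print(group)
```

        For the ambiguous query in this example the snippets fall into two groups, one about cars and one about the animal, which is the kind of narrowing that clustering offers the user.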

Bibliography

        1) Lee, T., "Uniform Resource Locators", 1 January 1994.
        2) Yuwono, B., and Lee, D. L., "Search and Ranking Algorithms for Locating Resources on the World Wide Web", 1996.
        3) Ericksen, L., "Web Page Creation and Design", 2nd edition, 2001.
        4) Thelwall, "The Responsiveness of Search Engine Indexes", 2001.
        5) "How Google Works", http://www.googleguide.com/google_works.html, 2005.
        6) Saba Abd Al-khaliq, "and Arabic Internet Search Engines", 2002.
        7) Collin, S. M. H., "Dictionary of Personal Computing and the Internet", 2nd edition, Peter Collin Publishing, 1998.