DonNTU   Masters' portal

This abstract refers to a work that has not been completed yet. Estimated completion date: June 2018. Contact the author or his scientific adviser after that date to obtain the complete text.

Abstract of the master's thesis "Designing and implementing an intelligent meta-search system for finding quotes"

1. Relevance of the topic

Finding information on the Internet is an everyday problem for many people. In recent years, highly specialized search services have been emerging. Splitting the information search task into subtasks makes it possible to introduce new search methods and to improve search effectiveness significantly. However, few existing solutions are designed for finding texts on the Internet.

If the text is not widely known, or it was not found in the well-known online libraries, the user is forced to turn to search engines. The user enters meta-text into the interface of an information retrieval system (IRS) and receives several hundred or several thousand links in response. Some of these links lead to the sites of stores where the corresponding book can be bought, some lead to bibliographies or mentions of the work, and some are just information noise. Only some of the links lead to the text itself. A specialized IRS cuts off a significant part of the obviously irrelevant results, so creating a specialized IRS for a particular task is a more effective solution [1].

2. The purpose and objectives of the study, the planned results

The aim of the work is the design and implementation of an intelligent meta-search system for finding quotes.

Objectives of the study:

3. Review of research and development

The topic under investigation is popular not only in international but also in national scientific communities.

3.1 Overview of International Sources

The book Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze [2] covers classic search along with web search, as well as the classification and clustering of texts.

The book Algorithms of the Intelligent Web by Haralambos Marmanis and Dmitry Babenko [3] explains how to build the algorithms that form the intelligent core of such web applications.

3.2 Overview of national sources

The article Intelligent search in global and local computer networks and databases by Gennady Osipov, Ivan Tikhomirov, and Igor Smirnov [4] describes methods and tools for semantically relevant meta-search.

Another article by the same authors, Relational-situational method of searching and analyzing texts and its applications [5], briefly describes the relational-situational method of analyzing natural-language texts.

The book Internetics: Navigation in Complex Networks: Models and Algorithms by Dmitry Lande, Andrei Snarsky, and Igor Bezsudnov [6] covers issues related to the information structure of web space, the theory of complex networks, models of information retrieval and in-depth text analysis, and the general patterns of modern information flows and their modeling.

The textbook Modeling of Complex Networks by Lande and Snarsky [7] considers the basic problems of the theory of complex networks.

The book Computer Networks and Analytical Research by Alexander Dodonov, Dmitry Lande, and Vladimir Putyatin [8] is devoted to the theoretical and technological foundations of systems that support analytical research in the global network environment, and to methods and tools for monitoring, aggregating, and generalizing large information flows in computer networks.

3.3 Overview of local sources

The individual section by M.V. Kalamitra [9] describes coursework on developing Palaces of Crimea, an application for meta-search on the Internet.

4. Properties of meta-search systems and approaches to their implementation

4.1. Meta-search engine architecture

The meta-search system is built on the principles of a client-agent-server architecture with an unattended client: the client is a standard Web browser, the agent is the meta-search engine, and the server is the Web server of the so-called Virtual Library, whose search engines the agent queries. The virtual library combines electronic catalogs, an intelligent search system, and client sites [10].

Several problems must be solved when designing a meta-search system.

First, the most relevant documents, that is, those that best match the user's request, must be selected from the set of documents received from the search engines.

Second, the computing resources used by the meta-search server must be kept low by not overloading it with unnecessary information, which also saves a considerable amount of traffic [11].
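Both requirements can be illustrated with a short Python sketch. It is only an assumption about how such a selection step might look, not the design of the system described here: the function name `select_relevant`, the parameters `per_engine_limit` and `keep`, and the caller-supplied `relevance` function are all hypothetical.

```python
import heapq

def select_relevant(documents, relevance, per_engine_limit=20, keep=10):
    """Pick the most relevant documents while bounding the work done.

    documents: mapping of engine name -> ranked list of documents.
    relevance: caller-supplied function that scores one document (hypothetical).
    """
    candidates = []
    for docs in documents.values():
        # Truncate each engine's list early so that the server never
        # processes more than per_engine_limit documents per engine.
        candidates.extend(docs[:per_engine_limit])
    # Keep only the `keep` best-scoring documents overall.
    return heapq.nlargest(keep, candidates, key=relevance)
```

Truncating each list before scoring is what bounds the load on the server; the relevance function itself is left open because the text does not specify one.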

4.2. Defining a meta-search task

A meta-search engine is a search tool that sends a user's query to several search engines and directories [12].

A meta-search engine works as follows: the user's query is transformed into queries formatted, syntactically and logically, in the constructs that are optimal for each individual traditional search engine. From one query, the meta-search engine thus produces a series of queries addressed to several ordinary search engines [13]. After collecting the results, the meta-search engine removes duplicate links and, in accordance with its own algorithm, combines the results into a common list.

Meta-search engines are not designed to index and store data; their purpose is purely to search and to process search results [14].
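The principle described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the engine names `engine_a` and `engine_b`, their query syntax, the URL normalization rule, and the reciprocal-rank scoring used for merging are all chosen for the example, and the result lists are passed in rather than fetched over the network.

```python
from urllib.parse import urlsplit

# Hypothetical per-engine query formatters (the syntax is an assumption).
FORMATTERS = {
    "engine_a": lambda q: f'"{q}"',
    "engine_b": lambda q: "+".join(q.split()),
}

def build_queries(user_query):
    """Turn one user query into one query string per engine."""
    return {name: fmt(user_query) for name, fmt in FORMATTERS.items()}

def normalize(url):
    """Build a comparison key: scheme and 'www.' dropped, host lower-cased,
    trailing slash removed (an assumed notion of 'the same link')."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    return key + "?" + parts.query if parts.query else key

def merge_results(result_lists):
    """Merge ranked lists into one list without duplicate links.

    Each link is scored by the sum of 1/rank over the engines that
    returned it; this simple rank-fusion rule is an assumption.
    """
    scores, first_seen = {}, {}
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            key = normalize(url)
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            first_seen.setdefault(key, url)
    ordered = sorted(scores, key=lambda k: -scores[k])
    return [first_seen[k] for k in ordered]
```

Links returned by several engines accumulate a higher score and therefore move toward the top of the common list, while each link appears only once.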

Illustration 1 shows the general scheme of the operation of meta-search systems.

Illustration 1 — General scheme of the operation of meta-search systems

4.3. Solution of the meta-search task for finding quotes

Searching for a quote means searching texts for a given fragment.

The user who submits such a query most likely wants to find the origin of the quotation: either to see the work from which it was taken, or at least to learn the author and the title of that work.

Let’s consider this problem in more detail, and also introduce some restrictions and definitions.

  1. Searching for texts on the Internet refers to the situation in which the user knows the title of the work and/or its author (first and last name) and wants to obtain the full text of this work in electronic form.
  2. A text is a complete linguistic work characterized by the presence of an author and a title.
  3. The examples use literary works in Russian, although the search methods are applicable to texts of any genre and subject (technical, journalistic, etc.).

The standard solution to the problem of searching for texts on the Internet is to create systems that index the texts found on the Internet. In such a system, the user enters meta-text into a standard search interface and, if the text has been indexed, receives the address at which the text was found during indexing.

There is another way to search for texts on the Internet, which we will call search by quotation, or citation search. The basic idea is that general-purpose information retrieval systems allow a whole phrase to be specified as a query, and the result of such a query contains only those documents in which the whole phrase is present with its word order preserved. Thus, if the user knows a quote from the text instead of its meta-text, the search is reduced to entering this quote into Google or Yandex. The search engine then either returns links or signals that there is no such text on the Internet.
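As an illustration of phrase search, the Python sketch below turns a quote into an exact-phrase query by wrapping it in double quotes. The helper name `phrase_query_url` is hypothetical, and the URL patterns are simply the publicly visible search-page addresses of the two engines, taken here as an assumption; a real system would more likely use each engine's official API.

```python
from urllib.parse import urlencode

# Search page address and query parameter name for each engine
# (assumed from the public search pages, not from an official API).
SEARCH_PAGES = {
    "google": ("https://www.google.com/search", "q"),
    "yandex": ("https://yandex.ru/search/", "text"),
}

def phrase_query_url(engine, quote):
    """Build a search URL that asks for the quote as an exact phrase."""
    base, param = SEARCH_PAGES[engine]
    # Drop embedded double quotes and collapse whitespace so that the
    # whole quote stays inside one pair of quotation marks.
    cleaned = " ".join(quote.replace('"', " ").split())
    return base + "?" + urlencode({param: f'"{cleaned}"'})
```

For example, `phrase_query_url("google", "to be or not")` produces a URL whose `q` parameter is the percent-encoded form of `"to be or not"`, quotation marks included.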

The meta-search engine for finding quotes will work in several stages.

  1. The user submits a request containing a quote.
  2. The system parses the request.
  3. The parsing results are passed to the semantic analysis block.
  4. Based on the results of syntactic and semantic analysis, and using dictionaries of associations and synonyms, the system generates several queries that are variations of the original.
  5. The system sends the generated queries to standard search engines, for example Google or Yandex.
  6. The system analyzes the search engines' results, selects the most suitable sources of the quote, and displays them to the user [17].
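The stages above can be sketched in Python as follows. This is a simplified illustration under several assumptions rather than the algorithm of the thesis: the semantic analysis of stage 3 is omitted, the synonym dictionary is a toy example, the word-overlap score stands in for the real analysis of results, and the `search` callable (which returns pairs of URL and snippet) is a hypothetical stand-in for real search engines.

```python
import itertools
import re

# Toy synonym dictionary; a real system would load dictionaries of
# associations and synonyms (the contents here are invented).
SYNONYMS = {"big": ["large"], "quick": ["fast", "rapid"]}

def parse(quote):
    """Stage 2: split the request into lower-case word tokens."""
    return re.findall(r"\w+", quote.lower())

def variations(tokens, limit=10):
    """Stage 4: generate query variants by substituting synonyms.

    The first variant is always the original token sequence.
    """
    options = [[t] + SYNONYMS.get(t, []) for t in tokens]
    combos = itertools.product(*options)
    return [" ".join(c) for c in itertools.islice(combos, limit)]

def score(quote_tokens, snippet):
    """Stage 6: fraction of quote tokens that occur in a result snippet."""
    if not quote_tokens:
        return 0.0
    snippet_tokens = set(parse(snippet))
    return sum(t in snippet_tokens for t in quote_tokens) / len(quote_tokens)

def find_sources(quote, search, top=3):
    """Stages 1, 5 and 6: run every variant and keep the best sources."""
    tokens = parse(quote)
    best = {}
    for query in variations(tokens):
        for url, snippet in search(query):
            best[url] = max(score(tokens, snippet), best.get(url, 0.0))
    return sorted(best, key=lambda u: -best[u])[:top]
```

Passing the search function in as a parameter keeps the sketch independent of any particular engine and makes it possible to try the pipeline with canned results.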

Illustration 2 shows the scheme of the quote search algorithm.

Illustration 2 — Scheme of searching quotes algorithm

Conclusions

The analysis of sources has shown that the design and implementation of meta-search systems is a relevant topic not only in international but also in national and local scientific communities.

The main requirements for meta-search systems have been formulated, and the operating principle of meta-search systems, together with their advantages and disadvantages, has been described. A scheme of the author's own algorithm for finding quotes from a text fragment has also been presented.

Further work will focus on refining the meta-search system schemes as the requirements for and the loads on the system change, and on developing an application that implements the minimum functionality found in existing meta-search systems, which is needed to model and study the application's response to the emerging load.

List of sources

  1. A.S. Grebenkov. Searching for texts on the Internet based on a database of quotes. X All-Russian Joint Conference, pp. 258-260 [Electronic resource]. Access mode: http://ict.edu.ru/vconf/files/7877.pdf
  2. Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze. Introduction to Information Retrieval [Electronic resource]. Access mode: https://www.ozon.ru/context/detail/id/5497130/
  3. Haralambos Marmanis, Dmitry Babenko. Algorithms of the Intelligent Web. Advanced techniques for collecting, analyzing, and processing data [Electronic resource]. Access mode: https://www.ozon.ru/context/detail/id/6753996/
  4. G.S. Osipov, I.A. Tikhomirov, I.V. Smirnov. Intelligent search in global and local computer networks and databases. Program Systems: Theory and Applications. Pereslavl-Zalessky, 2004, pp. 21-34 [Electronic resource]. Access mode: http://docplayer.ru/27455876-Intellektualnyy-poisk-v-globalnyh-i-lokalnyh-vychislitelnyh-setyah-i-bazah-dannyh.html
  5. G.S. Osipov, I.A. Tikhomirov, I.V. Smirnov. Relational-situational method of searching and analyzing texts and its applications. Artificial Intelligence and Decision Making, 2008, No. 2, pp. 3-10 [Electronic resource]. Access mode: http://docplayer.ru/29580361-Relyacionno-situacionnyy-metod-poiska-i-analiza-tekstov-i-ego-prilozheniya.html
  6. Lande D.V., Snarsky A.A., Bezsudnov I.V. Internetics: Navigation in Complex Networks: Models and Algorithms [Electronic resource]. Access mode: http://poiskbook.kiev.ua/art/internetica/
  7. Lande D.V., Snarsky A.A. Modeling of Complex Networks [Electronic resource]. Access mode: http://freescb.info/sites/freescb.info/files/mss-new.pdf
  8. A.G. Dodonov, D.V. Lande, V.G. Putyatin. Computer Networks and Analytical Research [Electronic resource]. Access mode: http://dwl.kiev.ua/art/ksai/an-book.pdf
  9. Kalamitra Marina Viktorovna. Meta-search system Palaces of Crimea [Electronic resource]. Access mode: http://masters.donntu.ru/2013/fknt/kalamitra/ind/index.htm
  10. Sarkisova I.O. Automating the search for non-indexed resources in distributed computer networks [Electronic resource]. Access mode: http://magazine.stankin.ru/arch/n_10/14/index.html.
  11. Architecture of meta-search systems [Electronic resource]. Access mode: http://citforum.ru/internet/search/metaping.shtml
  12. Meta-search systems [Electronic resource]. Access mode: http://catalysis.ru/link/index.php?ID=12&SECTION_ID=54
  13. Meta-search systems [Electronic resource]. Access mode: http://www.vsepoisk.ru/2009/07/blog-post_23.html
  14. Meta-search systems: principles of operation, experiments in clustering search results [Electronic resource]. Access mode: http://life-prog.ru/2_10898_metapoiskovie-sistemi-printsipi-raboti-opiti-klasterizatsii-poiskovih-rezultatov.html
  15. Meta-search systems [Electronic resource]. Access mode: https://studopedia.org/11-95698.html
  16. Cheat sheet on meta-search systems [Electronic resource]. Access mode: http://internetno.net/category/shpargalki/meta-search/
  17. Serezhenko O.A., Kolomoitseva I.A. Applying meta-search to the problem of finding quotes // Software Engineering: Methods and Technologies for Developing Information and Computing Systems (PIIVS-2016): proceedings of the I scientific and practical conference (student section). November 16-17, 2016. Donetsk, Donetsk National Technical University, 2016, pp. 194-200.