Institute of Informatics and Artificial Intelligence (IIAI)

Department of Artificial Intelligence Systems (AIS)

Speciality: Artificial Intelligence Systems

Models and algorithmic support for the construction of semantic networks in natural language texts

Scientific adviser: Sergey Voronoy, Ph.D. (Candidate of Technical Sciences)

Abstract

Contents

  1. Goals and Objectives

  2. The relevance of the topic

  3. The proposed scientific novelty

  4. Expected practical results

  5. Summary of the author's own results

  6. Conclusion

  7. Literature



1 Goals and Objectives

The aim of the research is to create a new, more efficient method of constructing semantic networks from natural language texts.

To achieve this goal, the following tasks must be solved:

- to analyse in detail the existing methods of constructing semantic networks;

- to identify the major shortcomings and implementation problems of existing algorithms for constructing semantic networks;

- to determine how thoroughly these issues have been studied;

- to define the scope of application of existing technologies;

- to determine which problems can be solved with semantic networks;

- based on the analysis above, to draw conclusions and propose a new method for constructing semantic networks that can solve the existing problems of this domain as fully as possible.

2 The relevance of the topic

The volume of information grows rapidly with each passing day, and this information has to be ordered and processed. Therefore, means for storing data and mechanisms for processing it quickly and efficiently are needed.

Semantic networks are one such mechanism: they make it possible to process information and accumulated knowledge effectively.

Knowledge representation in network models is the closest to the way knowledge is expressed in natural language texts. It is based on the idea that all the necessary information can be described as a set of triples (a r b), where a and b are objects and r is a binary relation between them [1].
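
As a minimal sketch of the triple representation, assuming Python and example facts invented for illustration (they are not taken from this work), a knowledge base can be held as a set of (a, r, b) tuples and queried by pattern. The names `Triple` and `match` and the sample relations are hypothetical.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One fact (a r b): object a, binary relation r, object b."""
    a: str
    r: str
    b: str


# Illustrative facts only; they are not taken from the thesis.
facts = {
    Triple("cat", "is_a", "animal"),
    Triple("cat", "has", "tail"),
    Triple("dog", "is_a", "animal"),
}


def match(facts, a: Optional[str] = None, r: Optional[str] = None,
          b: Optional[str] = None):
    """Return, in sorted order, all triples that agree with every given field."""
    return sorted(
        t for t in facts
        if (a is None or t.a == a)
        and (r is None or t.r == r)
        and (b is None or t.b == b)
    )


# Which objects are linked to "animal" by the relation "is_a"?
animals = [t.a for t in match(facts, r="is_a", b="animal")]
```

Leaving a field as `None` turns it into a wildcard, so the same function answers "what is a cat?" and "what is an animal?" style questions over the same set of triples.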

A semantic network is an information model of a domain in the form of a directed graph whose vertices correspond to objects of the domain and whose arcs (edges) define the relationships between them. Objects can be concepts, events, properties and processes.
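
The directed-graph view can be sketched as follows. This is a hedged illustration in Python, not the method developed in this work: labelled arcs are stored in an adjacency mapping, and arcs of one relation are followed transitively. The class name `SemanticNetwork`, its methods and the sample objects are assumptions made for the example.

```python
from collections import defaultdict, deque


class SemanticNetwork:
    """Directed graph: vertices are domain objects, arcs are labelled relations."""

    def __init__(self):
        # vertex -> list of (relation, target vertex)
        self.arcs = defaultdict(list)

    def add(self, source, relation, target):
        """Add the arc source --relation--> target."""
        self.arcs[source].append((relation, target))

    def reachable(self, start, relation):
        """Vertices reachable from start via arcs of one relation (breadth-first)."""
        seen, order = {start}, []
        queue = deque([start])
        while queue:
            vertex = queue.popleft()
            for rel, target in self.arcs.get(vertex, []):
                if rel == relation and target not in seen:
                    seen.add(target)
                    order.append(target)
                    queue.append(target)
        return order


net = SemanticNetwork()
net.add("cat", "is_a", "mammal")
net.add("mammal", "is_a", "animal")
net.add("cat", "has", "tail")

# Everything a cat "is", found by following is_a arcs transitively.
ancestors = net.reachable("cat", "is_a")
```

Keeping the relation label on each arc lets a single graph hold several kinds of relationship while a query still follows only one of them.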

From the above we can conclude that the subject of this research is highly relevant, as it addresses one of the most important challenges of the information society: processing accumulated information and presenting it in a form more convenient for storage.

3 The proposed scientific novelty

The scientific novelty of the research lies in the development of a new method for constructing semantic networks from natural language texts, based on a detailed analysis of existing methods and on the identification of the deficiencies and problems of existing methods and algorithms.

4 Expected practical results

The planned result of the work is algorithmic support based on the developed method. The algorithm must meet the following requirements:

- high execution speed;

- absence of the defects found in previous algorithms;

- effective operation;

- the possibility of optimization for solving non-standard problems.

5 Summary of the author's own results

The analysis of existing methods for constructing semantic networks revealed the major shortcomings and implementation problems of existing algorithms. The degree to which the problem has been studied and the relevance of using semantic networks were also determined, and conclusions were drawn from this analysis. Work is now under way on a new method for constructing semantic networks that can solve the problems of the domain as fully as possible.

6 Conclusion

Text processing tasks emerged almost immediately after the advent of computer technology. Despite half a century of research in artificial intelligence, the experience of computational linguistics, and a huge leap in the development of Internet technologies and related disciplines, a fully satisfactory solution to the practical problems of text processing has not been found. However, the IT industry demands a satisfactory solution to some of these problems. Thus, the development of data warehouses makes the problems of information retrieval and of forming well-constructed text documents topical. The rapid development of the Internet has led to the creation and accumulation of huge amounts of textual information, which requires full-text search and automatic text classification (for example, software tools to fight spam); while the first task has been solved more or less satisfactorily, a solution to the second is still far away.

In recent years, thanks to the development of document management systems, the availability of constantly updated legal reference collections and other factors, specialized arrays of (non-formalized) text documents have been accumulating. By analogy with structured information, where improvements in analysis led to the emergence of data warehousing, the development of document management systems may eventually require full-text repositories that enable comprehensive analysis and study of non-formalized natural language text.

At the time of writing this abstract, the master's thesis has not yet been completed. The date of final completion is January 2012. The full text of the work and materials on the subject may be obtained from the author or his scientific adviser after that date.

Literature

  1. Averkin A.N., Gaaze-Rapoport M.G., Pospelov D.A. Explanatory Dictionary of Artificial Intelligence. Moscow: Radio i Svyaz, 1992.
  2. Artificial intelligence at home. Semantic networks. [Electronic resource]. Available at: http://www.aimatrix.nm.ru/aimatrix/SemanticNetworks.htm.
  3. Wikipedia electronic library. Semantic network. [Electronic resource]. Available at: http://ru.wikipedia.org/wiki/Семантическая_сеть.
  4. Panchenko A. Building a semantic network from heterogeneous data. [Electronic resource]. Available at: http://it-claim.ru/Persons/Panchenko/presentation2010_sept_final.pdf.
  5. Barsegyan A.A., Kupriyanov M.S., Stepanenko V.V., Kholod I.I. Data Analysis Technologies: Data Mining, Visual Mining, Text Mining, OLAP. St. Petersburg: BHV-Petersburg, 2007. pp. 194-204.
  6. Waterman D. A Guide to Expert Systems (Russian translation from English). Moscow: Mir, 1989. 388 p.
  7. Kruglov V.V., Borisov V.V. Artificial Neural Networks. Theory and Practice. 2nd ed., stereotyped. Moscow: Goryachaya Liniya - Telecom, 2002. 382 p.
  8. Melnik K.V., Ershova S.I. Problems and main approaches to solving the problem of medical diagnostics. [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/soi/2011_2/melnik.pdf.
  9. Nosova N.Yu. A semantic model of the content of an innovative technical project. [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/soi/2011_4/nosov.pdf.
  10. Marchenko A.A., Nikonenko A.A. Contextual semantic analysis of text. A system for text monitoring and qualitative evaluation of a focus object. [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/ii/2008_3/JournalAI_2008_3/Razdel9/02_Marchenko_Nikonenko.pdf.
  11. Shatokhin N.A. Semantic analysis of natural languages and its applications. [Electronic resource]. Available at: http://masters.donntu.ru/2011/fknt/shatokhin/library/article4.htm.