Alexander Kalinin

Institute of Informatics and Artificial Intelligence (IIAI)

Department of Artificial Intelligence Systems (AIS)

Speciality: Artificial Intelligence Systems

Models and algorithmic software for automation of ontology building

Scientific adviser: Ph.D. (c.t.s.), Sergey Voronoy

Abstract

Contents

  1. Goals and Objectives

  2. The relevance of the topic

  3. Expected scientific novelty

  4. Expected practical results

  5. Global overview of research and development

    1. Approach based on lexical and syntactic patterns
    2. Approach based on production systems
    3. An approach based on statistical methods
  6. National survey of research and development

  7. A local review of research and development

  8. Own results

  9. Conclusion

  10. Literature



1 Goals and Objectives

The aim of this research is to develop methods and algorithmic software for the automatic construction of ontologies. Achieving this goal requires the following tasks:

- to review and study the basic features of semantic technologies and the prospects for their development;

- to analyze methods, models and algorithms for constructing ontologies;

- to identify problem areas in ontology development.

2 The relevance of the topic

The development of knowledge-intensive fields of human activity in modern society is accompanied by a growing role for computer technology. The flow of information is increasing rapidly, so new ways are needed to store, represent, formalize, systematize and automatically process it. Systems that can extract information, such as semantic links, from text without human intervention attract great interest. In response to these needs, new technologies are being developed to solve the stated problem. Alongside the World Wide Web, its extension, the Semantic Web, is emerging, in which hypertext pages carry additional markup that describes the semantics of the elements on the page. An integral component of the Semantic Web is the concept of an ontology, which describes the content of the semantic markup.

Ontologies are a convenient tool for representing and storing knowledge, so developing an algorithmic framework for creating, updating and maintaining ontologies is an urgent task at present.

3 Expected scientific novelty

As part of ongoing research, this work is expected to review existing methods for building ontologies, identify their weaknesses and propose a new method for the automatic construction of ontologies.

4 Expected practical results

In practical terms, the research should result in an explicit algorithm for automatic ontology construction that takes into account the shortcomings of previous algorithms.

The algorithm should meet the following requirements:

- self-sufficiency;

- high efficiency;

- maximum simplicity;

- ease of use.

5 Global overview of research and development



5.1 Approach based on lexical and syntactic patterns

This approach belongs to the group of methods for automatic ontology construction that use linguistic tools. Building an ontology in this way draws on all levels of natural language analysis: morphology, syntax and semantics. Thus, for the automatic construction of an ontology, the work [2] uses one of the methods of semantic analysis of natural language texts: lexical and syntactic patterns.
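
The idea of a lexical-syntactic pattern can be illustrated with a minimal sketch. It assumes a single, simplified English pattern of the kind described by Hearst [3] ("X such as A, B and C") written as a regular expression. The pattern, the function name extract_is_a and the sample sentence are illustrative assumptions, not the patterns used in [2], and a real system would match over morphologically and syntactically analyzed text rather than raw strings.

```python
import re

# Hypothetical, simplified pattern: "<hypernym> such as <w1>, <w2> and <w3>".
# Only single-word terms are handled; this is an assumption of the sketch.
PATTERN = re.compile(
    r"(?P<hypernym>\w+)\s+such\s+as\s+"
    r"(?P<hyponyms>\w+(?:\s*,\s*(?!(?:and|or)\b)\w+)*"
    r"(?:\s*,?\s*(?:and|or)\s+\w+)?)"
)

# Separators between the listed hyponyms: ",", ", and", " and", " or".
SEPARATOR = re.compile(r"\s*,\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def extract_is_a(text):
    """Return (hyponym, hypernym) pairs found by the pattern, lower-cased."""
    pairs = []
    for match in PATTERN.finditer(text):
        hypernym = match.group("hypernym").lower()
        for hyponym in SEPARATOR.split(match.group("hyponyms")):
            if hyponym:
                pairs.append((hyponym.lower(), hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "Languages such as Prolog, Lisp and Python are used in AI."
    for hyponym, hypernym in extract_is_a(sentence):
        print(f"{hyponym} is-a {hypernym}")
```

Each extracted pair is a candidate "is-a" relation between two ontology classes. In practice many such patterns are needed, and the candidates still have to be filtered.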

5.2 Approach based on production systems

This approach belongs to the group of methods for automatic ontology construction that rely on techniques from the field of artificial intelligence. Effective automatic construction of ontologies can build on the ability of artificial intelligence methods to extract elements of knowledge from text and to process them in non-trivial ways. An analysis of natural language text processing shows that rules of various kinds are widely used to solve problems in this subject area. This fact, together with the declarative representation of methods for automatic ontology construction, justifies the use of production systems as a model for representing knowledge about the method [4].
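
How a production system derives new knowledge can be shown with a small forward-chaining sketch. The fact format (three-element tuples), the two rules and all names below are assumptions made for illustration; they are not taken from [4], which describes a considerably richer model.

```python
# Working memory: a set of facts, each a tuple (relation, subject, object).
# Rules: functions that read the facts and return the facts they can derive.


def rule_such_as(facts):
    """IF text says 'A such as B' THEN B is a subclass of A (hypothetical rule)."""
    return {("subclass_of", b, a) for (kind, a, b) in facts if kind == "such_as"}


def rule_transitivity(facts):
    """IF X subclass_of Y AND Y subclass_of Z THEN X subclass_of Z."""
    links = [(s, o) for (kind, s, o) in facts if kind == "subclass_of"]
    return {
        ("subclass_of", x, z)
        for (x, y1) in links
        for (y2, z) in links
        if y1 == y2 and x != z
    }


def forward_chain(initial_facts, rules):
    """Apply all rules repeatedly until no rule produces a new fact."""
    facts = set(initial_facts)
    while True:
        derived = set()
        for rule in rules:
            derived |= rule(facts) - facts
        if not derived:
            return facts
        facts |= derived


if __name__ == "__main__":
    memory = {
        ("such_as", "language", "prolog"),
        ("subclass_of", "language", "formalism"),
    }
    for fact in sorted(forward_chain(memory, [rule_such_as, rule_transitivity])):
        print(fact)
```

Because each rule is an independent function, rules can be added or replaced without touching the inference loop, which reflects the modularity usually attributed to production systems.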

5.3 An approach based on statistical methods

1. Preparation of the collection

1.1 Converting the documents to a common format.

1.2 Tokenization.

1.3 Stemming (lemmatization).

1.4 Removal of stop words.

2. Determination of the ontology classes.

3. Defining relationships between classes.
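
The stages listed above can be sketched end to end as follows. This is a toy illustration under stated assumptions: the stop-word list and the suffix-stripping stemmer are stand-ins for real linguistic resources, and choosing classes by document frequency and relationships by co-occurrence counts is only one simple possibility, since the text does not specify how stages 2 and 3 are carried out. All function names and thresholds are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

STOP_WORDS = {"a", "an", "the", "of", "and", "is", "are", "in", "to", "for"}  # toy list


def normalize(document):
    # Stage 1.1: bring documents to a common format (here: lower-cased plain text).
    return document.lower()


def tokenize(text):
    # Stage 1.2: split the text into alphabetic tokens.
    return re.findall(r"[a-z]+", text)


def stem(token):
    # Stage 1.3: crude suffix stripping, a stand-in for a real stemmer or lemmatizer.
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("sses"):
        return token[:-2]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def prepare(documents):
    # Stage 1 as a whole; stage 1.4 (stop-word removal) is applied after stemming.
    prepared = []
    for document in documents:
        stems = [stem(token) for token in tokenize(normalize(document))]
        prepared.append([s for s in stems if s not in STOP_WORDS])
    return prepared


def determine_classes(prepared, min_docs=2):
    # Stage 2 (assumed criterion): terms that occur in at least min_docs documents.
    doc_freq = Counter(term for tokens in prepared for term in set(tokens))
    return sorted(term for term, count in doc_freq.items() if count >= min_docs)


def define_relations(prepared, classes, min_cooccurrence=2):
    # Stage 3 (assumed criterion): class pairs that co-occur in enough documents.
    counts = Counter()
    for tokens in prepared:
        present = sorted(set(tokens) & set(classes))
        counts.update(combinations(present, 2))
    return {pair: n for pair, n in counts.items() if n >= min_cooccurrence}


if __name__ == "__main__":
    docs = [
        "Ontologies describe classes of the domain.",
        "An ontology relates classes.",
        "Stemming reduces tokens.",
    ]
    prepared = prepare(docs)
    classes = determine_classes(prepared)
    print(classes)
    print(define_relations(prepared, classes))
```

The co-occurrence criterion only signals that two classes are related, not how, which is consistent with the limitation of the statistical approach noted in the conclusion.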

6 National survey of research and development

At the national level, research in the field of ontologies is well represented by V. Lytvyn's article «Problems of optimizing the structure and content of ontologies and methods for solving them» [10].

7 A local review of research and development

Donetsk National Technical University is also carrying out work on ontology construction. The article by A.V. Grigoriev and E. Pawlowski, «Analysis of methods of ontology construction for building expert systems for the synthesis of models of complex systems in CAD», examines existing ontologies in different domains and proposes an ontological approach to creating websites [11].

8 Own results

At this stage of the research, the main issues of the topic and the advantages and disadvantages of existing methods of ontology construction have been identified. Research on improving the methods of automatic ontology construction is under way.

9 Conclusion

The methods of automatic ontology construction presented in this work give developers a wide choice; however, these methods are not without drawbacks. The use of production rules provides simplicity, high performance, modularity and ease of modification. Among its shortcomings is a lack of connectivity between the semantic rules. The approach based on lexical and syntactic patterns is universal, which is its merit, but it is not very effective when the collection is small. The statistical approach is also quite versatile and is not limited to the Russian language, but it can extract only the basic relationships needed to build an ontology, which is its drawback. The next stage of the research is therefore to improve the existing methods in order to obtain more adequate results.

10 Literature

  1. Konstantinova N.S., Mitrofanova O.A. Ontologies as knowledge storage systems. [Electronic resource]. Available at: http://window.edu.ru/resource/795/58795/files/68352e2-st08.pdf
  2. Rabchevsky E.A. Automatic construction of ontologies based on lexical and syntactic patterns for information retrieval // In: «Digital Libraries: Advanced Methods and Technologies, Digital Collections»: Proceedings of the 11th All-Russian Scientific Conference RCDL-2009. Petrozavodsk, 2009. Pp. 69–77.
  3. Hearst, M. A. Automatic Acquisition of Hyponyms from Large Text Corpora // Proceedings of the 14th Conference on Computational Linguistics, Volume 2, pp. 539–545, Nantes, France. Association for Computational Linguistics, Morristown, NJ, USA, 1992.
  4. Naikhanova L.V. Methods and models for automatic ontology construction based on genetic and automata-based programming. Krasnoyarsk, 2008. 36 p.
  5. Weiss, S. M., Indurkhya, N., Zhang, T., and Damerau, F. J. Text Mining: Predictive Methods for Analyzing Unstructured Information. Springer, 2005.
  6. Dobrynin, V., Patterson, D. W., and Rooney, N. Contextual Document Clustering. [Electronic resource]. Available at: http://www.sophiasearch.com/uploads/documents/contextual_document_clustering.pdf
  7. Syafrullah, M., and Salim, N. Improving Term Extraction Using Particle Swarm Optimization Techniques // Journal of Computer Science. 2010. Vol. 6. No. 3. Pp. 323–329.
  8. Mozzherina E.S. Automatic construction of an ontology from a collection of text documents. [Electronic resource]. Available at: http://rcdl.ru/doc/2011/paper45.pdf.
  9. Swamy M., Thulasiraman K. Graphs, Networks, and Algorithms. Moscow: Nauka, 1984 (Russian edition).
  10. Lytvyn V.V. Problems of optimizing the structure and content of ontologies and methods for solving them. [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/Vnulp/ISM/2011_715/19.pdf
  11. Grigoriev A.V., Pawlowski E.V. «Analysis of methods of ontology construction for building expert systems for the synthesis of models of complex systems in CAD». [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/Npdntu_ota/2011_21/article_21_13.pdf