DonNTU   Masters' portal

Abstract


Introduction

Computer vision is a scientific discipline concerned with building artificial systems that detect and identify objects in images or video streams. Although pattern recognition is an important problem, it remains difficult for a computer, which cannot account for all the ways a visible object may appear: it cannot keep every form and variation of each object in memory.

1. Theme urgency

People receive about 90% of their information about the world through vision. In computing, a source of information can be text, music, video, or an image. In recent years photography has become a popular hobby because the hardware is easy to come by (almost every phone has a camera). Photos accumulate over time, and searching through a catalogue of pictures becomes increasingly difficult. Image search is directly related to recognition, because it can only be performed after the images have been classified.

2. Goal and tasks of the research

The goal of this work is to study the existing approaches to classification in digital image collections. Decomposing the goal gives the following list of research tasks:

  1. studying existing methods of image classification;
  2. creating a test collection of images;
  3. implementing various image classification methods on the test collection;
  4. annotating each image;
  5. classifying the collection as a whole;
  6. identifying the advantages and disadvantages of the image classification methods;
  7. implementing the various methods on the test collection.

Object of the research: methods of image classification. Subject of the research: the advantages and disadvantages of image classification methods.

3. Review of research and developments

Before reviewing the methods, it is necessary to mention that an image to be classified contains one or more partial images (descriptors). These can be ordered into a set of descriptors that describes the image; the set can vary depending on the method [1].

Recognition based on decision theory

This approach is based on discriminant functions. Suppose each object is described by an n-dimensional feature vector x and there are W classes of images. The task is to find W discriminant functions such that, if x belongs to class i, the discriminant function with index i has a larger value than all the others. In recognition methods based on matching, each class is represented by a prototype feature vector. An unknown image is assigned to the class whose prototype is closest to it under a chosen metric. The simplest approach is the minimum-distance classifier, which computes the Euclidean distance between the feature vector of the unknown object and each prototype vector and assigns the object to the class with the smallest distance. In correlation matching, a template is moved across the image as a sliding window and compared with the image at each position.
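The minimum-distance classifier described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prototype of a class is taken to be the mean of its training vectors, and the two-dimensional feature vectors, the class labels, and the function names are hypothetical rather than taken from the work.

```python
import math

def class_prototypes(training_set):
    """Average the feature vectors of each class to get one prototype per class."""
    prototypes = {}
    for label, vectors in training_set.items():
        count = len(vectors)
        prototypes[label] = [sum(column) / count for column in zip(*vectors)]
    return prototypes

def classify_min_distance(x, prototypes):
    """Return the label whose prototype is nearest to x in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))

# Hypothetical training data: two features per image, two classes.
training_set = {
    "circle": [[1.0, 0.9], [0.9, 1.0]],
    "square": [[0.1, 0.2], [0.2, 0.1]],
}
prototypes = class_prototypes(training_set)
label = classify_min_distance([0.8, 0.85], prototypes)
print(label)
```

Choosing the smallest distance is equivalent to choosing the largest discriminant function if each discriminant is defined as the negated distance to the class prototype.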

Another approach is the statistically optimal (Bayesian) classifier. As in most fields that deal with measuring and interpreting physical phenomena, probabilistic approaches matter in image recognition because of the randomness that affects how image classes are generated. It is possible to derive a classification method that minimizes the probability of error. The Bayesian approach is classical in image recognition theory and is a cornerstone of many other methods. It rests on the theorem that, if the probability density functions of the classes are known, the classification algorithm with the minimum probability of error can be written in explicit form.
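As a rough illustration of the Bayesian rule, the sketch below assigns a one-dimensional feature value to the class with the largest product of class-conditional density and prior probability. The Gaussian form of the densities, the class names, and all numeric parameters are assumptions made for the example; the source does not specify them.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_classify(x, classes):
    """classes maps label -> (prior, mean, std); pick the largest p(x | class) * P(class)."""
    def score(label):
        prior, mean, std = classes[label]
        return prior * gaussian_pdf(x, mean, std)
    return max(classes, key=score)

# Hypothetical classes described by the mean brightness of an image.
classes = {
    "dark": (0.5, 0.2, 0.1),
    "bright": (0.5, 0.8, 0.1),
}
print(bayes_classify(0.3, classes))
print(bayes_classify(0.7, classes))
```

With equal priors and equal spreads the decision boundary lies midway between the two means; unequal priors shift it towards the less probable class.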

In the approaches considered, the essence of training is simple. The training images of each class are used to compute the parameters of the discriminant function for that class. Once the parameters have been obtained, the structure of the classifier is fixed, and its final quality depends only on how well the real sets of images satisfy the statistical assumptions made when the classification method was derived [1].

Single-layer neural network

Although a single neuron can carry out the simplest recognition procedures, the power of neural computation comes from connecting neurons into networks. The simplest network consists of a group of neurons forming a layer, as shown in Figure 1. Note that the circles on the left serve only to distribute the input signals. They perform no computation and are therefore not counted as a layer. Each element of the input set X is connected to each artificial neuron by a separate weight, and each neuron outputs a weighted sum of the inputs to the network. In artificial and biological networks many connections may be absent; all of them are shown here. There can also be connections between the output and input elements within a layer [2].

Figure 1 – Single-layer neural network
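The weighted sums computed by such a layer can be sketched as follows. The weight values and the function name are hypothetical; the only thing taken from the text is that every input is connected to every neuron by its own weight and that each neuron outputs the weighted sum of the inputs.

```python
def layer_output(inputs, weights):
    """weights[j][i] is the weight from input i to neuron j; returns one weighted sum per neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

inputs = [1.0, 0.0, 1.0]                 # the input set X
weights = [
    [0.5, -0.2, 0.1],                    # weights of neuron 0
    [0.3, 0.8, -0.5],                    # weights of neuron 1
]
outputs = layer_output(inputs, weights)
print(outputs)
```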

Recognizing a circle with a single-layer neural network

We narrow the task to circle recognition. There is a collection of circle images (monochrome, for simplicity) with which the neural network is trained. Each image is preprocessed: it is divided into segments, as if a grid were laid over the image.

The result is a matrix of image segments, a mask of the image: if the current segment contains any coloured pixels, the corresponding matrix position is set to 1, otherwise to 0. A weight matrix is also prepared and initialized with arbitrary values in the range from -1 to 1. According to the perceptron convergence theorem, regardless of which initial coefficients are chosen, the network will find a solution (provided one exists) in a finite number of iterations. The weight matrix has the same size as the mask matrix. Training the network thus requires these two matrices, a threshold value (discussed below), and a learning rate that sets how fast the weights converge to the desired result.
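A minimal sketch of this preprocessing step is given below. It assumes the image is already available as a 2D list of 0/1 pixel values whose dimensions are divisible by the grid size; the function names and the tiny 4x4 example image are hypothetical.

```python
import random

def image_to_mask(pixels, grid_rows, grid_cols):
    """Lay a grid over the image; a cell is 1 if its segment contains any coloured pixel."""
    cell_h = len(pixels) // grid_rows
    cell_w = len(pixels[0]) // grid_cols
    mask = []
    for r in range(grid_rows):
        row = []
        for c in range(grid_cols):
            segment = [
                pixels[y][x]
                for y in range(r * cell_h, (r + 1) * cell_h)
                for x in range(c * cell_w, (c + 1) * cell_w)
            ]
            row.append(1 if any(segment) else 0)
        mask.append(row)
    return mask

def init_weights(rows, cols, seed=None):
    """Weight matrix of the same size as the mask, filled with values from -1 to 1."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

pixels = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
mask = image_to_mask(pixels, 2, 2)
weights = init_weights(2, 2, seed=1)
print(mask)
```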

The essence of training is to multiply the corresponding elements of the two matrices and sum the products; the result must exceed a threshold chosen by hand, for example 0.8. If the result does not exceed the threshold, the network has to be retrained.
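The training loop can be sketched as follows. The source only states that the sum of products is compared with a threshold and that the network is retrained when the threshold is not exceeded; the concrete update used here is the classic perceptron rule, which is an assumption, as are the labelled non-circle sample, the 3x3 masks, the zero initial weights (chosen so the example is deterministic), and all function names.

```python
def weighted_sum(mask, weights):
    """Sum of element-wise products of the mask matrix and the weight matrix."""
    return sum(m * w for mrow, wrow in zip(mask, weights) for m, w in zip(mrow, wrow))

def predict(mask, weights, threshold=0.8):
    """1 (circle) if the weighted sum exceeds the threshold, otherwise 0."""
    return 1 if weighted_sum(mask, weights) > threshold else 0

def train(samples, weights, threshold=0.8, rate=0.1, max_epochs=1000):
    """samples is a list of (mask, target). Adjusts weights in place.

    Returns the number of epochs used, or None if max_epochs was reached.
    """
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for mask, target in samples:
            error = target - predict(mask, weights, threshold)
            if error != 0:
                errors += 1
                for r, mrow in enumerate(mask):
                    for c, m in enumerate(mrow):
                        weights[r][c] += rate * error * m
        if errors == 0:
            return epoch
    return None

# Hypothetical 3x3 masks: a ring stands in for a circle, a cross for a non-circle.
circle = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
cross = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
weights = [[0.0] * 3 for _ in range(3)]
epochs = train([(circle, 1), (cross, 0)], weights)
print(epochs, predict(circle, weights), predict(cross, weights))
```

The learning rate controls how far the weights move on each correction: a larger value reaches the threshold in fewer passes but changes the weights more coarsely.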

Figure 2 – Process of training of a network
(animation: 7 images, 10 repeat cycles, 142 Kb)

The animation shows the network training process: F(X) is the evaluation function and Y is its result, which is compared with the value gY. The outcome of the comparison determines what happens next: either the network continues training (T(X), with X adjusted by ΔX) or the result of training (E) is obtained.

Once all the weights have been adjusted, that is, once they suit every mask representation of a circle, the recognition stage can begin. It proceeds in the same way, only without retraining: even if the first computation of the sum of products of the elements of the weight matrix and the mask matrix gives a result below the required value, the answer is final, and the program reports whether a circle was presented at the input.

Figure 2 – Program output

Multilayered neural networks

Larger and more complex neural networks generally have greater computational capabilities. Although networks of every imaginable configuration have been built, arranging neurons in layers mirrors the layered structure of certain parts of the brain. Such multilayer networks have proved to be more capable than single-layer ones, and algorithms for training them have been developed in recent years. Multilayer networks can be formed as cascades of layers: the output of one layer is the input of the next. Such a network is shown in Figure 3, again with all connections drawn [2].

Figure 3 – Multilayered neural network
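The cascade of layers can be sketched as a loop in which the output of each layer becomes the input of the next. The sigmoid activation applied to each weighted sum is an assumption (the text does not name an activation function), and the weights and function names are hypothetical.

```python
import math

def sigmoid(value):
    """Logistic activation; an assumed choice, not specified in the source."""
    return 1.0 / (1.0 + math.exp(-value))

def forward(inputs, layers):
    """layers is a list of weight matrices; each layer's output feeds the next layer."""
    signal = inputs
    for weights in layers:
        signal = [sigmoid(sum(w * x for w, x in zip(row, signal))) for row in weights]
    return signal

layers = [
    [[1.0, -1.0], [0.5, 0.5]],   # first layer: 2 inputs -> 2 neurons
    [[1.0, 1.0]],                # second layer: 2 inputs -> 1 neuron
]
output = forward([1.0, 1.0], layers)
print(output)
```

Without a nonlinear activation between the layers, a cascade of weighted sums would collapse into a single weighted sum, so the extra layers would add nothing.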

Work [12] describes a face recognition algorithm based on a backpropagation neural network, with the images preprocessed by principal component analysis, which makes the set of image features uncorrelated.

Work [5] deals with face recognition by neural network methods. Work [10] discusses search methods that take into account the shape and arrangement of objects in digital image collections; in particular, content-based image search uses features such as colour, texture, shape, spatial features, and characteristics essential to visual perception (granularity, contrast).

Neocognitron

The neocognitron is a self-organizing multilayer neural network. Its distinctive feature is that, thanks to the dynamic organization of its layers, the network becomes invariant to the position or viewing angle of the image being recognized. Work [3] is devoted to Fukushima's neocognitron. Work [6] presents the results of modelling a neocognitron with optimized execution time and simpler descriptions of the training and operation algorithms, and proposes a new approach to forming the training images and the connections between network layers. Works [9] and [11] consider image recognition in the presence of distortions and describe the model and training algorithm of the neocognitron. Work [8] describes the structure of a neocognitron and its training and operation algorithms for recognizing a person's face.

Fuzzy logic

The theory of fuzzy sets operates with qualitative concepts, which is characteristic of humans, and at the same time gives a quantitative assessment, which is characteristic of computers. It thus combines the advantages of human reasoning with the power of the computer. Fuzzy logic, which underlies fuzzy control methods, describes the nature of human thinking and reasoning, unlike traditional formal logical systems. For this reason, representing fuzzy initial information by mathematical means makes it possible to build models that most adequately reflect the various aspects of the uncertainty constantly present in the world around us [13].

Fuzzy logic is a branch of mathematics based on the concept of a fuzzy set. The idea of a fuzzy set is that each element belongs to it according to a membership function whose value can vary from 0 to 1, expressing a degree of confidence. Fuzzy inference can be represented in the form of a neural network and is often used to solve image recognition problems.
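A membership function can be illustrated with a short sketch. The triangular shape, the min/max definitions of fuzzy AND and OR (the standard Zadeh operators), and the "roundness" and "fill" features are assumptions chosen for the example; the source does not prescribe a particular membership function.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), rising to 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def fuzzy_and(a, b):
    """Zadeh intersection of two membership degrees."""
    return min(a, b)

def fuzzy_or(a, b):
    """Zadeh union of two membership degrees."""
    return max(a, b)

# Hypothetical features of an image region, both scaled to the range 0..1.
roundness = triangular(0.75, 0.5, 1.0, 1.5)   # degree of membership in "round"
fill = triangular(0.4, 0.0, 0.5, 1.0)         # degree of membership in "half filled"
print(fuzzy_and(roundness, fill), fuzzy_or(roundness, fill))
```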

Work [13] considers image recognition (for example, identifying a person) using the mathematical apparatus of fuzzy logic. An object recognition system needs at least three main stages: improving image quality by filtering out noise, segmenting or clustering the objects present in the image, and finally classifying the image. The resulting recognition depends on the quality of each of these stages: if an earlier stage gives a poor result, the subsequent stages will amplify the error.

Another important point is that the input data plays a major role at the image classification stage. If the data set is redundant or, on the contrary, insufficient, recognition quality will suffer. As a rule, classification is immediately preceded by feature extraction from the input information, that is, selecting the most significant information and ignoring the insignificant [13].

Conclusions

Judging by how widely they are covered on the Internet, neural network methods of image recognition are very popular. Fuzzy logic also provides powerful tools for building intelligent hardware and software systems for image recognition. It is also important that the image be well prepared for recognition: noise should be removed, and the recognition system has to distinguish the classes of images very well, so the classes have to be clearly separated.

At the time of writing, the master's work is not yet complete. Final completion: January 2015. The full text of the work and materials on the subject can be obtained after that date.

References

  1. R. Gonzalez, R. Woods. Digital Image Processing: Transl. from English. – Moscow: Tekhnosfera Publishing House, 2005. – P. 1073.
  2. Fundamentals of artificial neural networks [Electronic resource]. – Access mode: http://neural networks.chat.ru/foundations.html
  3. S. A. Terekhov. Fukushima's neocognitron [Electronic resource]. – Access mode: http://www.masters.donntu.ru/2004/kita/stryukov/...
  4. D. G. Muradina, N. S. Kostyukova. A study of classification methods for digital image collections. Information Control Systems and Computer Monitoring (ИУС КМ – 2014) / Proceedings of the V International Scientific and Technical Conference of Students, Postgraduates and Young Scientists. – Donetsk, DonNTU, 2014, Vol. 6, pp. 262-265.
  5. D. V. Briliuk, V. V. Starovoitov. Recognizing a person from a face image by neural network methods [Electronic resource]. – Access mode: http://goo.gl/CHJzCn
  6. R. Kh. Sadykhov, M. E. Vatkin. A training algorithm for the neocognitron neural network for handwritten character recognition [Electronic resource]. – Access mode: http://neuroface.narod.ru/files/neocog_hand_writ.pdf
  7. Alexandra Vagis, Anatoliy Gupal. Efficiency of Bayesian recognition procedures [Electronic resource]. – Access mode: http://www.foibg.com/ibs_isc/ibs-15/ibs-15-p11.pdf
  8. A. O. Sova. Recognizing a person with a neocognitron-type neural network [Electronic resource]. – Access mode: http://masters.donntu.ru/2011/fknt/sova/...
  9. Yu. S. Makhno. Recognizing graphic images with noise using a neocognitron-type neural network [Electronic resource]. – Access mode: http://masters.donntu.ru/2008/fvti/makhno/...
  10. M. Yu. Pokhil. Search methods that take into account the shape and arrangement of objects in digital image collections [Electronic resource]. – Access mode: http://masters.donntu.ru/2008/fvti/pohil/...
  11. K. V. Driga. Recognizing noisy and distorted images with a neocognitron [Electronic resource]. – Access mode: http://masters.donntu.ru/2006/fvti/driga...
  12. Hemant Singh Mittal, Harpreet Kaur. Face Recognition Using PCA & Neural Network [Electronic resource]. – Access mode: http://www.ijese.org/attachments/File/v1i6/F0266041613.pdf
  13. V. P. Poltorak, Ya. Yu. Dorogoy. A pattern recognition system based on a fuzzy neural classifier [Electronic resource]. – Access mode: http://aaecs.org/poltorak-vp-dorogoi-yayu-sistema...