DonNTU   Masters' portal

Abstract

Introduction

The problem of face recognition arises in many fields. It is an extremely complex task that demands substantial computing resources. To make it tractable, it is usually split into several simpler subtasks. Here there are two basic ones: face detection and face recognition. Even this division does not make the problem easy: detection is solved with algorithms of varying complexity, and identifying a person is also complex, so different approaches are used for it as well.

Face detection algorithms follow one of two approaches [1]. The first treats the face as a set of distinct features that characterize it as an object among other objects [1]. Such features include skin color, which varies only within a limited range, as well as the eyebrows, the eyes, the shape of the skull, etc. The second approach treats the picture containing a face as an image in general: it abstracts from face detection as such and generalizes it to the detection of any object in a photo.

The simplest face detection and recognition algorithms use image features. Their main advantage is high speed. Other algorithms rely on complex structures for storing and processing information that together form a whole system. One example of such a system is a neural network, which models the operation of the human brain. Despite the huge number of neural network types, most of them are specialized for a specific task. The most difficult tasks call for very complex multilayer neural networks. The complexity is often justified, because such networks are much more effective than conventional methods. One of the network types considered most suitable for face recognition is the neocognitron.

1. Relevance of the topic

The task of face recognition is becoming increasingly relevant. There are many places where face recognition can be applied. The best-known applications are security and forensics, but there are others. For example, in social networks the technology can automatically highlight and label people in photos. It is also extremely important in robotics and military applications, where it makes it possible to separate people into "friend" and "foe" or to sort them into classes by access level [1].

2. Goal and tasks of the research

The aim of the study is to develop a software product that can distinguish faces with high accuracy in real time.

The main research tasks:

  1. Analyze methods of face detection in photos.
  2. Select the most effective face detection method.
  3. Analyze face recognition algorithms.
  4. Select the most effective face recognition algorithm.
  5. Build a logical model of the neocognitron.
  6. Develop a software model of the neocognitron.

Research object: a neural network face recognition system.

Research subject: face detection and face recognition algorithms.

Within the master's thesis, up-to-date scientific results are expected in the following areas:

  1. Assessment of the quality of face detection.
  2. Identification of the most effective face detection model.
  3. Assessment of the quality of face recognition using neural networks.
  4. Creation of a neocognitron neural network model.
  5. Analysis of the efficiency of the face detection and face recognition algorithms.

3. Overview of research and development

Researchers all over the world work on face recognition. Face detection is addressed by scientists from different countries using a variety of methods, among which the Viola-Jones algorithm is considered the most effective. Face recognition with neural networks is studied mainly in Japan.

3.1 Overview of international sources

Back in 1998, the work of Papageorgiou et al. described the use of Haar wavelets for the task of face detection [2]. In 2001, P. Viola and M. Jones published several articles, among them [3] and [4], devoted to a modification of this idea. More sophisticated features help to increase efficiency. In addition, an article published in 1999 [5] describes the basic principles of adaptive boosting, an approach that is also used in the Viola-Jones algorithm. Because the algorithm gives good results, many articles have appeared since 2001 that describe its basic principles and implementation [6], as well as modifications for other tasks, including capturing emotions and movements.

The neocognitron grew out of an earlier network type, the cognitron. One of the first works on it [7] appeared in 1980. Later the idea was developed and applied to recognizing different kinds of patterns, including handwriting [8] and human faces [9].

3.2 Overview of local sources

At Donetsk National Technical University, several people work on face detection problems. The main contribution comes from O. I. Fedyaev, Associate Professor of Applied Mathematics and Computer Science, in works such as [10-13]. The work of assistant professor Y. V. Ladyzhensky should also be mentioned, including [14].

4. Face detection using the Viola-Jones algorithm

As mentioned above, there are two approaches to detecting a face in a photo. The Viola-Jones algorithm implements the second one, generalizing face detection to the detection of any object in a picture. It is also a learning algorithm, i.e. it must be trained before it can be applied. The main idea of algorithms of the second type is that pixels and their arrangement characterize the object in the image. To speed up detection, the Viola-Jones algorithm works not with individual pixels but with sets of pixels. Each set is divided into regions of two types: light and dark. Such a set is called a Haar feature. Its characteristic is a single value, and it is this value that indicates whether a face is present in the photo.

It works as follows. To determine whether a face is in the picture, the feature is applied to the image repeatedly. When the feature is overlaid on the image, some pixels fall under the light region and some under the dark one. The average brightness is computed for each region. Note that the Viola-Jones algorithm takes a grayscale image as input. Once the average brightness of the dark and light regions is known, the value of the dark region is subtracted from the value of the light region. The result is the Haar feature value. To decide whether the area under the feature belongs to a face, this value is compared with the value obtained for the feature during the training phase. Because a feature can yield different values depending on where it is placed, each feature must be checked at different positions and also when rotated by different angles. The classical scheme with Haar features involves rotation by 90 and 180 degrees. The Viola-Jones algorithm allows rotation by other angles as well. Other transformations of the features, including scaling, are also implemented.

A natural question arises: how should pictures of different sizes be handled? The algorithm uses the sliding window method: instead of applying the features to the entire image, they are applied only to a small area called the window. The window can be resized, and moving it makes it possible to cover the whole image. Picture 1 shows the general scheme of the Viola-Jones algorithm.

Picture 1 – The scheme of the Viola-Jones algorithm
(animation: 23 frames, 7 repetition cycles, 81.5 KB)
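
The computation of a single Haar feature value and the movement of the window can be illustrated with a short sketch. It is not the author's implementation: the function names, the horizontal two-rectangle layout (left half light, right half dark) and the tiny test image are assumptions chosen only for illustration.

```python
def region_mean(image, top, left, height, width):
    """Average brightness of a rectangular region of a grayscale image."""
    total = 0
    for row in image[top:top + height]:
        total += sum(row[left:left + width])
    return total / (height * width)


def two_rect_feature(image, top, left, height, width):
    """Light-minus-dark value of a horizontal two-rectangle feature.

    Assumption for this sketch: the left half of the rectangle is the
    light region and the right half is the dark region.
    """
    half = width // 2
    light = region_mean(image, top, left, height, half)
    dark = region_mean(image, top, left + half, height, half)
    return light - dark


def sliding_windows(image_h, image_w, window, step):
    """Yield the (top, left) corner of every square window that fits."""
    for top in range(0, image_h - window + 1, step):
        for left in range(0, image_w - window + 1, step):
            yield top, left


# Hypothetical 4x4 grayscale image: bright on the upper left, dark on the upper right.
image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [50, 50, 50, 50],
    [50, 50, 50, 50],
]
value = two_rect_feature(image, 0, 0, 2, 4)
positions = list(sliding_windows(4, 4, window=2, step=2))
```

A large value means that the image under the feature matches its light/dark layout; over a uniform area the value is zero. In a detector the value would then be compared with the threshold learned for this feature.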

Clearly, all these transformations require a great deal of computation. To speed them up, a different representation of the image is used: the integral image. The integral image is a matrix in which each element is the sum of the brightness of all pixels above and to the left of the current one, i.e. element (x, y) stores the total brightness of all pixels in the rectangle with corners (0, 0), (x, 0), (x, y) and (0, y). This reduces the time needed to compute a Haar feature value to a constant.
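
The constant-time property can be shown with a minimal sketch. The names `integral_image` and `rect_sum` are hypothetical, and the table is taken to include the current pixel itself, which is one common convention. Once the table is built, the sum over any rectangle needs at most four lookups, whatever the rectangle's size.

```python
def integral_image(image):
    """Return a table where entry [y][x] is the sum of image[0..y][0..x]."""
    height, width = len(image), len(image[0])
    table = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0  # running sum of the current row up to column x
        for x in range(width):
            row_sum += image[y][x]
            above = table[y - 1][x] if y > 0 else 0
            table[y][x] = row_sum + above
    return table


def rect_sum(table, top, left, bottom, right):
    """Sum of the pixels in an inclusive rectangle, using four lookups."""
    total = table[bottom][right]
    if top > 0:
        total -= table[top - 1][right]
    if left > 0:
        total -= table[bottom][left - 1]
    if top > 0 and left > 0:
        total += table[top - 1][left - 1]  # this corner was subtracted twice
    return total


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
lower_right = rect_sum(table, 1, 1, 2, 2)  # 5 + 6 + 8 + 9
```

Dividing such a sum by the area of the region gives its average brightness, so a Haar feature value costs only a handful of lookups per region.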

Let us return to training. It is the most time-consuming procedure, because every feature in the set has to be applied to a large number of images in order to find the optimal feature values. The quality of detection depends on the quality of training, so many images are needed. During training, we need to distinguish the features that give the correct result very often from those whose result is close to random. The former are called strong features and the latter weak. Weak features on their own are ineffective for recognition, but they can still be used: several of them can be combined to form a strong one. This technique is called AdaBoost.
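
How weak rules are combined into a strong one can be sketched with a textbook version of AdaBoost over simple threshold rules. This is an illustration under stated assumptions, not the training procedure used in the work: the function names, the exhaustive threshold search and the one-dimensional toy data are all hypothetical. Labels are +1 (face) and -1 (not a face).

```python
import math


def stump(sample, f, threshold, polarity):
    """Weak rule: answer `polarity` if feature f reaches the threshold."""
    return polarity if sample[f] >= threshold else -polarity


def train_adaboost(samples, labels, rounds):
    """Return a list of (alpha, feature, threshold, polarity) weak rules."""
    n = len(samples)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        # Try every feature, threshold and polarity; keep the lowest weighted error.
        for f in range(len(samples[0])):
            for threshold in sorted({s[f] for s in samples}):
                for polarity in (1, -1):
                    error = sum(
                        w for w, s, y in zip(weights, samples, labels)
                        if stump(s, f, threshold, polarity) != y
                    )
                    if best is None or error < best[0]:
                        best = (error, f, threshold, polarity)
        error, f, threshold, polarity = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, f, threshold, polarity))
        # Misclassified samples gain weight, correct ones lose it.
        weights = [
            w * math.exp(-alpha * y * stump(s, f, threshold, polarity))
            for w, s, y in zip(weights, samples, labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, sample):
    """Weighted vote of all weak rules."""
    score = sum(a * stump(sample, f, t, p) for a, f, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data: no single threshold separates the classes.
samples = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
labels = [1, 1, -1, -1, 1, 1]
ensemble = train_adaboost(samples, labels, rounds=3)
predictions = [predict(ensemble, s) for s in samples]
```

On this toy set every single threshold rule misclassifies at least two of the six samples, yet the weighted vote of three such rules classifies all of them correctly.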

In summary, the main concept of the Viola-Jones algorithm is fast detection at the cost of slow training, which allows the application to run in real time.

5. Neocognitron neural network

The neocognitron was developed by Kunihiko Fukushima in 1980. This neural network is a successor of the idea embodied in the cognitron, developed by the same author in 1975. Its main advantage over its predecessor is that it corresponds more closely to the model of the visual system.

The neocognitron is a multilayer convolutional neural network. It uses two types of neurons: simple (S) and complex (C). The task of a simple neuron is to monitor its receptive field and to detect in it the pattern it was trained on. The receptive field is the area of the image in which the neuron is responsible for pattern recognition. Simple neurons are organized into groups called layers. Within one layer, the neurons are tuned to detect the same pattern, but in different receptive fields, and the weights within one layer are the same. Together the neurons of one layer go through the various positions of the pattern, so the pattern is detected wherever it appears.

Complex neurons monitor all the simple neurons that recognize a particular pattern. To excite such a neuron, it is enough for any one of the simple neurons it monitors to be excited. Activity of a simple neuron means that it has found the characteristic features of the pattern at a specific position, while activity of a complex neuron means that the pattern was found in this layer, without specifying where.
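
The division of labor between simple and complex neurons can be illustrated with a deliberately simplified sketch. It is not Fukushima's full model: the binary outputs, the fixed threshold, the single complex neuron per layer and the names `s_layer` and `c_cell` are assumptions made for illustration. The actual neocognitron uses graded responses and inhibitory connections, and its complex neurons watch local neighborhoods.

```python
def s_layer(image, template, threshold):
    """Simple neurons: one shared weight template, one neuron per position.

    Each neuron sees only its own receptive field and fires (1) when the
    weighted sum over that field reaches the threshold.
    """
    th, tw = len(template), len(template[0])
    out = []
    for top in range(len(image) - th + 1):
        row = []
        for left in range(len(image[0]) - tw + 1):
            response = sum(
                template[i][j] * image[top + i][left + j]
                for i in range(th) for j in range(tw)
            )
            row.append(1 if response >= threshold else 0)
        out.append(row)
    return out


def c_cell(s_cells):
    """Complex neuron: fires if any monitored simple neuron is active."""
    return 1 if any(any(row) for row in s_cells) else 0


# Hypothetical template for a corner; -1 penalizes a pixel that should be off.
corner = [[1, 1],
          [1, -1]]

image_a = [[1, 1, 0, 0],
           [1, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 0, 0]]
image_b = [[0, 0, 0, 0],
           [0, 0, 0, 0],
           [0, 0, 1, 1],
           [0, 0, 1, 0]]
blank = [[0] * 4 for _ in range(4)]

s_a = s_layer(image_a, corner, threshold=3)
s_b = s_layer(image_b, corner, threshold=3)
found_a, found_b = c_cell(s_a), c_cell(s_b)
found_blank = c_cell(s_layer(blank, corner, threshold=3))
```

The simple neurons fire at different positions for the two images, while the complex neuron gives the same answer for both, which is the position tolerance described above.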

Each successive layer contains fewer neurons. The last layer contains only the neurons responsible for whole patterns, one for each recognizable pattern.

Each layer after the first takes as its input the picture produced by the complex neurons of the previous layer. As the signal passes through the layers, the pattern becomes more general and less dependent on position and on a certain degree of distortion. Picture 2 shows an example of the neocognitron recognizing a character.

Picture 2 – Example of neocognitron recognizing the symbol

The neocognitron can be trained with a teacher (supervised) or without one (unsupervised). The latter option makes the system more autonomous. In such training, as images are presented to the neocognitron, the system begins to pick out the characteristic features of the object, giving priority to the most important ones. The general principle of training stays the same from layer to layer: features typical of many input signals are identified.

Conclusions

The analysis of different face detection and recognition algorithms led to the selection of two promising ones: the Viola-Jones algorithm for detection and the neocognitron neural network for recognition. The research showed that the neocognitron requires careful tuning for the problem at hand. At this stage, one part of the future product has been implemented, namely face detection. The next steps are to develop a model of the neocognitron, to implement this type of neural network in software, and to integrate these components into a single software product.

At the time of writing this abstract, the master's thesis is not yet complete. Final completion: December 2015. The full text of the work and materials on the topic can be obtained from the author or his supervisor after that date.

References

  1. V. Vezhnevets, A. Degtiareva, Face detection and localization in an image, Computer Graphics and Multimedia, online journal (in Russian).
  2. C. Papageorgiou, M. Oren and T. Poggio, A General Framework for Object Detection, International Conference on Computer Vision, 1998, pp. 555-562.
  3. P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features", Computer Vision and Pattern Recognition, 2001, vol. 1, pp. 511-518.
  4. P. Viola, M. Jones, Robust Real-time Object Detection, 2000, 24 p.
  5. Y. Freund, R. E. Schapire, A Short Introduction to Boosting, Journal of Japanese Society for Artificial Intelligence, 1999.
  6. C. de Souza, Haar-feature Object Detection in C#, 2001.
  7. K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position", Biological Cybernetics, 1980, pp. 193-202.
  8. K. Fukushima, N. Wake, Handwritten Alphanumeric Character Recognition by the Neocognitron, IEEE Transactions on Neural Networks, vol. 2, no. 3, 1991, pp. 355-365.
  9. G. Poli, J. H. Saito, J. F. Mari, M. R. Zorzan, Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture, 20th International Symposium on Computer Architecture and High Performance Computing, 2008, pp. 81-88.
  10. O. I. Fedyaev, Yu. S. Makhno, A system for recognizing noisy and distorted graphic images based on a neocognitron neural network, Eleventh National Conference on Artificial Intelligence with International Participation KII-2008: Conference Proceedings, vol. 3, Moscow: LENAND, 2008, 464 p. (in Russian).
  11. A. A. Sova, O. I. Fedyaev, A study of models of the S- and C-neurons of the neocognitron in training and pattern recognition, VII International Scientific and Technical Conference of Students, Postgraduates and Young Scientists "Informatics and Computer Technologies", Donetsk, DonNTU, 2011, pp. 164-168 (in Russian).
  12. G. Yu. Kostetskaya, O. I. Fedyaev, Encoding images of human faces using a Kohonen self-organizing map, V International Scientific and Technical Conference of Students, Postgraduates and Young Scientists "Informatics and Computer Technologies", Donetsk, DonNTU, 2009, pp. 265-268 (in Russian).
  13. N. Kh. Umyarov, O. I. Fedyaev, Extracting a face from a video stream frame for the purpose of its recognition, VII International Scientific and Technical Conference of Students, Postgraduates and Young Scientists "Informatics and Computer Technologies", Donetsk, DonNTU, 2011, pp. 173-177 (in Russian).
  14. A. V. Kolesnik, Y. V. Ladyzhensky, A distributed software system for face recognition, VI International Scientific and Technical Conference of Students, Postgraduates and Young Scientists "Informatics and Computer Technologies", Donetsk, DonNTU, 2010, pp. 248-251 (in Russian).