Recognition of human faces using the neocognitron neural network
Introduction
Today there are many neural network paradigms for solving the problem of recognizing images of human faces. Recognition of such images is complicated by distortions of the patterns: noise, offset, rotation, and changes in image size. These difficulties are addressed by choosing an appropriate neural network architecture and an appropriate way of training it. An analysis of the many works devoted to this problem shows that there is no model which is fully robust to all four types of distortion. One promising method for recognizing distorted images is the use of special neural structures such as the neocognitron. This is due to the particular architecture of this type of neural network, which simulates the human visual system.
Survey of research and developments
Development and improvement of the neocognitron model is at a very early stage, so every step in its improvement is important from both a scientific and a practical point of view. The neocognitron models that exist today are, unfortunately, only demonstration prototypes, because present-day computers do not have enough computing power to serve as a full-scale platform for the neocognitron. Several methods now exist that address this problem by optimizing the neocognitron's architecture [1, 2]. This work is devoted to the development of a neural network algorithm that performs fast and accurate recognition of input images of a human face at different resolutions and degrees of distortion. The efficiency of the neocognitron under changes of various system settings will also be analyzed, as will the influence of various factors on the quality of face recognition.
The basic model of the neocognitron was introduced by Fukushima in 1980 as an extension of the cognitron, a neural network algorithm that could recognize complex images. The neocognitron showed excellent results in recognizing patterns that were shifted in position, noisy, or distorted in shape. Nevertheless, the neocognitron could not cope with recognizing patterns rotated by an angle, and the recognition process took a long time [1, 6].
For a long time many scientists improved the neocognitron model, among them Freeman, Satoh, Fukumi and others. Two models of neocognitron learning were also developed:
learning with a teacher;
learning without a teacher.
Structure of the neocognitron
The neocognitron has a hierarchical structure aimed at modeling the human visual system. It consists of a sequence of processing layers (S-layers and C-layers) organized hierarchically. The input image is fed to the first layer and transmitted through the planes of the subsequent layers until it reaches the output layer, where the recognized image is identified [1, 2, 7, 8, 12].
The input (sensor) layer of the neocognitron is a rectangular layer consisting of light-sensitive cells. Each subsequent layer is composed of groups of neurons. Neurons of one group have the same weights and recognize the same part of the image. These neurons form so-called S-cells located in the corresponding S-layers. S-cells are used to extract features of the image coming from the previous layer [6, 10, 11].
The outputs of S-layer neurons are connected to C-layer neurons. It should also be noted that all the cells in a single plane have input links of the same spatial distribution; only the positions of the preceding cells from which their input links come are shifted in parallel. This situation is illustrated in Figure 1. Even in the learning process, in which the values of the S-cells' input links change, the variable connections always change subject to this restriction [2, 4, 7, 9, 10].
The links within one module that connect S-cells to C-cells are rigid and do not change during learning. The links connecting neurons of an S-layer to neurons of the preceding C-layer are variable and may change during learning. An S-cell is activated if a certain pattern is found at a certain place in the previous layer. The pattern to which the cell reacts is determined during learning [1, 3, 6, 10].
Figure 1 – Input connection to the neurons in the corresponding planes
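To make the weight-sharing constraint of Figure 1 concrete, here is a minimal sketch in Python with NumPy (all names are illustrative, not taken from the project itself): computing the response of one S-plane amounts to sliding a single shared weight template across the previous layer, so every cell of the plane detects the same feature, each at its own position.

    import numpy as np

    def s_plane_response(inputs: np.ndarray, template: np.ndarray) -> np.ndarray:
        """Response of one S-plane: every cell applies the SAME shared
        weights (template) to its own shifted receptive field."""
        th, tw = template.shape
        h, w = inputs.shape
        out = np.zeros((h - th + 1, w - tw + 1))
        for y in range(out.shape[0]):
            for x in range(out.shape[1]):
                # receptive field of the cell at (y, x), shifted in parallel
                field = inputs[y:y + th, x:x + tw]
                out[y, x] = np.sum(template * field)
        return out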
Figure 2 shows the structure of a neocognitron consisting of three modules with pairs of S- and C-layers. It should be noted that the number of the neocognitron's layers depends on the complexity of the task; a network with more layers is able to recognize more complex images. Such alternation of layers, in which the S-layer extracts the key features of the image and the C-layer corrects distortions of the image, has proven most effective in the synthesis of recognition systems that are invariant to image distortion [5, 8].
Figure 2 – Schematic diagram illustrating the interconnections between layers in the neocognitron
Thus the algorithm of image recognition using the neocognitron looks as shown below:
Figure 3 – The algorithm of image recognition using the neocognitron (animated GIF: 6 frames, 6 repetitions, 40.5 KB, 393 x 608 pixels)
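In code, this recognition pass can be summarized by the following sketch (illustrative Python, reusing s_plane_response from the sketch above; a plain rectification and max operation stand in for the full C-cell formula (5) given below, inhibition is omitted for brevity, and all templates within a module are assumed to have the same size):

    import numpy as np

    def c_pool(plane: np.ndarray, size: int = 2) -> np.ndarray:
        """C-cell stage: local max-pooling blurs exact feature positions,
        giving tolerance to shift and small deformations."""
        h, w = plane.shape
        return np.array([[plane[y:y + size, x:x + size].max()
                          for x in range(0, w - size + 1, size)]
                         for y in range(0, h - size + 1, size)])

    def recognize(image: np.ndarray, modules: list) -> int:
        """Forward pass through alternating S- and C-layers. `modules` is
        a list of modules, each a list of S-plane weight templates; the
        planes of the last module act as class scores."""
        planes = [image]
        for templates in modules:
            s_out = [sum(s_plane_response(p, t) for p in planes)
                     for t in templates]
            planes = [c_pool(np.maximum(s, 0.0)) for s in s_out]  # rectify, blur
        return int(np.argmax([p.max() for p in planes]))  # winning class index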
Cells employed in the neocognitron
All the cells employed in the neocognitron are of analog type: i.e., the input and output signals of the cells take non-negative analog values. Each cell has characteristics similar to a biological neuron, if we consider that the output signal of the cell corresponds to the instantaneous firing rate of an actual biological neuron.
In the neocognitron, four different kinds of cells are used: S-cells, C-cells, Vs-cells and Vc-cells. As a typical example of these cells, we will first discuss the characteristics of the S-cell.
As shown in Figure 4, an S-cell has many input terminals, either excitatory or inhibitory. If the cell receives signals through excitatory input terminals, its output will increase. On the other hand, a signal from an inhibitory input terminal will suppress the output. Each input terminal has its own interconnecting coefficient, whose value is positive. Although the cell has only one output terminal, it can send signals to a number of input terminals of other cells. An S-cell has an inhibitory input, which causes a shunting effect. Let u_1, u_2, ..., u_N be the excitatory inputs and v the inhibitory input. The output of the inhibitory neuron is defined by:
    v = \sqrt{ \sum_i b_i u_i^2 }        (1)

where: v – the output of the inhibitory neuron;
i – ranges over all complex neurons to which the inhibitory neuron is connected;
b_i – the weight of the i-th connection from an excitatory cell to the inhibitory cell;
u_i – the output of the i-th excitatory cell.
The output w of this S-cell is defined by:

    w = \varphi\left[ \frac{1 + \sum_{i=1}^{N} a_i u_i}{1 + b v} - 1 \right]        (2)

where a_i and b represent the excitatory and inhibitory interconnecting coefficients, respectively, and v is the output of the inhibitory neuron, found from (1) [7, 9, 10].
The function \varphi is defined by:

    \varphi[x] = \begin{cases} x, & x \ge 0 \\ 0, & x < 0 \end{cases}
Figure 4 – Input-to-output characteristics of an S-cell
Let e be the sum of all the excitatory inputs weighted with the interconnecting coefficients, and h the inhibitory input multiplied by its interconnecting coefficient, i.e.:

    e = \sum_{i=1}^{N} a_i u_i, \qquad h = b v        (3)

Substituting (3) into (2), we get:

    w = \varphi\left[ \frac{1 + e}{1 + h} - 1 \right] = \varphi\left[ \frac{e - h}{1 + h} \right]        (4)
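For a single S-cell, formulas (1)-(4) can be evaluated directly, as in this minimal sketch (illustrative Python/NumPy; variable names mirror the formulas above):

    import numpy as np

    def s_cell_output(u, a, b_in, b):
        """Output of one S-cell according to formulas (1)-(4).
        u    : outputs u_i of the presynaptic excitatory cells
        a    : excitatory interconnecting coefficients a_i
        b_in : weights b_i from the excitatory cells to the inhibitory cell
        b    : inhibitory interconnecting coefficient of the S-cell"""
        u = np.asarray(u, dtype=float)
        v = np.sqrt(np.sum(b_in * u ** 2))    # (1): inhibitory neuron output
        e = np.sum(a * u)                     # (3): total weighted excitation
        h = b * v                             # (3): weighted inhibition
        return max((e - h) / (1.0 + h), 0.0)  # (4): phi[(e - h) / (1 + h)]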
The output of a neuron of the C-layer is defined by:

    c = \psi\left[ \sum_i u_i s_i \right]        (5)

where: s_i – the output of the i-th simple neuron;
u_i – the weight coefficient from the simple neuron to the complex one [3, 10].
For the neurons of the complex layer, different activating functions \psi are used; a common choice in Fukushima's model is the saturating function \psi[x] = \varphi[x] / (\alpha + \varphi[x]).
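A corresponding sketch for one C-cell, assuming the saturating activation psi[x] = phi[x] / (alpha + phi[x]) mentioned above (the value of alpha and the function name are illustrative):

    import numpy as np

    def c_cell_output(s, u, alpha=0.5):
        """Output of one C-cell according to formula (5).
        s : outputs s_i of the presynaptic simple cells
        u : weight coefficients u_i from the simple cells to this cell"""
        x = float(np.sum(np.asarray(u) * np.asarray(s)))  # weighted sum
        phi = max(x, 0.0)                                 # half-wave rectification
        return phi / (alpha + phi)                        # saturates toward 1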
Self-organization of the network
Before learning, the weights of the S-layers are given various small positive values, and the C-layers receive their constant coefficients. Then learning fragments of the image are presented, in arbitrary order, from the input (or C_{i-1}) layer to the S_i layer. From all the cells of the S_i layer, the cell whose reaction to the current fragment is strongest is chosen. After its weights are adjusted, all other neurons of the same plane adopt the weights of this representative neuron. Then a new fragment of the image is presented and learning continues in the same way. The learning process is repeated for all fragments of the image.
The respective weights are changed according to:

    \Delta a_i = \alpha b_i u_i, \qquad \Delta b = \alpha v

where \alpha is the learning coefficient and b_i, u_i and v are as defined in formula (1).
Self-organization of the network is finished when all S-layers have learned all the training patterns.
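The whole procedure can be summarized by the following sketch (illustrative Python, reusing s_cell_output from the earlier sketch; each S-plane stores one shared template, and the inhibitory coefficient b is kept fixed for simplicity, although in the full model it is reinforced as well):

    import numpy as np

    def train_s_layer(fragments, templates, b_in, alpha=0.1):
        """Competitive (winner-take-all) learning for one S-layer.
        fragments : training image fragments, each flattened to a vector
        templates : weight vectors a_i, one shared template per S-plane
        b_in      : weights b_i into the inhibitory neuron, formula (1)
        alpha     : learning coefficient"""
        for u in fragments:
            u = np.asarray(u, dtype=float)
            # responses of all planes to the current fragment
            responses = [s_cell_output(u, a, b_in, b=1.0) for a in templates]
            k = int(np.argmax(responses))                   # representative plane
            templates[k] = templates[k] + alpha * b_in * u  # delta a_i = alpha b_i u_i
            # every cell of plane k adopts the new weights automatically,
            # since the plane stores a single shared template
        return templates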
References
1. Driga K.V. Recognition of noisy and distorted patterns using the neocognitron // Proceedings of the International Student Scientific-Practical Conference "Informatics and Computer Technologies". – Donetsk: DonNTU, December 15, 2005. – pp. 233-234 (in Russian).
2. Makhno Yu.S. A software emulator of a neocognitron-type neural network for recognition of graphic patterns // Scientific Papers of Donetsk National Technical University, series "Informatics, Cybernetics and Computer Engineering" (IKVT-2008), issue 9(132). – Donetsk: DonNTU, 2008. – pp. 265-269 (in Russian).
3. Wasserman P. Neural Computing: Theory and Practice. – Moscow: Mir, 1992. – 260 p. (Russian translation).
4. Fukushima K. Neocognitron for handwritten digit recognition. – Tokyo: Tokyo University of Technology, 2002. – pp. 161-170.
5. Freeman J., Skapura D. Neural Networks: Algorithms, Applications and Programming Techniques. – Addison-Wesley, 1991. – pp. 373-392.
6. Satoh S., Aso H., Miyake S., Kuroiwa J. Pattern Recognition System with Top-Down Process of Mental Rotation // Proc. of IWANN'99, vol. 1. – 1999. – pp. 816-825.
7. Fukushima K. Neocognitron capable of incremental learning. – Tokyo: Tokyo University of Technology, 2003. – pp. 37-43.
8. Fukushima K. Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position // Biological Cybernetics, vol. 36. – 1980. – pp. 193-202.
9. Fukushima K., Miyake S. Neocognitron: a new algorithm for pattern recognition tolerant of deformations and shifts in position. – NHK Broadcasting Science Research Laboratories, 1981. – pp. 455-469.
10. Fukushima K. Analysis of the Process of Visual Pattern Recognition by the Neocognitron. – Osaka University, 1988. – pp. 413-420.
11. Fukumi M., Omatu S., Nishikawa Y. Rotation-Invariant Neural Pattern Recognition System Estimating a Rotation Angle // IEEE Trans. Neural Networks, vol. 8. – 1997. – pp. 568-581.
12. Satoh S., Kuroiwa J., Aso H., Miyake S. Recognition of rotated patterns using neocognitron // IEEE Trans. Neural Networks, vol. 9. – 1997. – pp. 588-597.
Important note
At the time this abstract was written, the master's project was not yet finished. The project will be completed after December 2011. The full text of the master's project and related materials can be obtained from the author or her supervisor after that date.