Yuri Makhno, Master's student at Donetsk National Technical University

Abstract:

Topic of the master's thesis: "Recognition of distorted patterns with a neocognitron neural network"



Introduction

There are many neural network based solutions to the problem of image recognition. Images with distortions (noise, translation, rotation, scaling) cause considerable difficulties in recognition. This problem is addressed by the choice of an appropriate architecture and training method. An analysis of published work shows that there is as yet no model insensitive to all four kinds of distortion.

For this work we have chosen the neocognitron, a powerful model with a qualitatively new architecture and unsupervised (self-organizing) training; this article describes that network. On the basis of the implemented program model of the neocognitron, the training process and the recognition of graphic images were analysed. The modelling results showed that a distinctive feature of the neocognitron is its high speed of training and recognition.

The neocognitron model

The neocognitron is a hierarchical neural network consisting of a number of successive layers with incomplete (rather sparse) connections between them [1]. The model is based on the human visual system. In the neocognitron, nodes of the first layer recognize simple elements such as lines and corners. At higher levels, nodes recognize more complex and abstract patterns such as circles, triangles and rectangles. The degree of abstraction increases until the nodes recognize, for example, faces or other complex forms. In general, nodes at higher levels receive input signals from lower nodes and react to a wider area of the visual field. The outputs of higher-level nodes depend less on position and are more robust to distortions [2]. The hierarchical structure of the neocognitron's recognition process is shown in fig. 1.



Figure 1 – Hierarchical structure of the neocognitron
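
To make the hierarchy concrete, the Python sketch below lists a possible layer plan of a small neocognitron. The plane counts and sizes are illustrative assumptions, not the configuration of the described model; the point is only that the matrices shrink with depth while each cell responds to a wider part of the input.

    # Illustrative layer plan (plane counts and sizes are assumptions,
    # not the configuration of the described model).
    layer_plan = [
        # (name, planes, plane size)
        ("U0",   1, (19, 19)),  # input image, treated as a zero complex layer
        ("US1", 12, (19, 19)),  # detectors of lines and corners
        ("UC1", 12, (11, 11)),  # shift-tolerant versions of the same features
        ("US2", 38, (11, 11)),  # combinations: arcs and angles
        ("UC2", 38, (7, 7)),
        ("US3", 32, (7, 7)),    # larger fragments of whole shapes
        ("UC3", 32, (3, 3)),
        ("US4", 10, (3, 3)),
        ("UC4", 10, (1, 1)),    # one cell per class: a position-independent answer
    ]
    for name, planes, (h, w) in layer_plan:
        print(f"{name}: {planes} planes of {h}x{w} cells")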


Each layer consists of a stack of matrices. All matrices of one layer have identical dimension. From layer to layer, the dimension of the matrices decreases or remains the same [1, 3]. The elements of the matrices are neurons. All neurons of one matrix recognize the same pattern (fig. 2); that is, each of them is tuned to one specific input pattern. Each neuron is sensitive to a limited area of the input image, called its receptive area.
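
Since all neurons of one matrix are tuned to the same pattern, they share one set of weights and differ only in where they look. A minimal Python sketch of this idea (the helper matrix_response and its kernel are hypothetical, not code from the program model):

    import numpy as np

    def matrix_response(prev_layer, kernel):
        # All neurons of one matrix apply the SAME weights (kernel) to their
        # own receptive areas, so the matrix as a whole detects one pattern
        # at every position, in effect a convolution with a shared kernel.
        h, w = prev_layer.shape
        kh, kw = kernel.shape
        out = np.zeros((h - kh + 1, w - kw + 1))
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                area = prev_layer[i:i + kh, j:j + kw]  # receptive area of neuron (i, j)
                out[i, j] = np.sum(area * kernel)
        return out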



Figure 2 – Connections between S-neurons and neurons in their "visibility" area


There are two types of layers in the neocognitron: "simple" (S-layers) and "complex" (C-layers), consisting respectively of "simple matrices" (which recognize input information) and "complex matrices" (which generalize the recognized information). Neurons of a simple matrix are connected only to some neurons of the preceding complex matrix; in the same way, neurons of the complex matrix in the following layer are connected to neurons of the simple matrix.
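
The division of labour between the two cell types can be sketched as follows; the functions are simplified approximations of Fukushima's equations (see [1] for the exact formulas), not the thesis implementation:

    import numpy as np

    def s_cell(area, weights, inhibition, r=1.5):
        # "Simple" neuron: responds only when its visibility area matches the
        # learned feature closely enough; r controls selectivity, and the
        # inhibitory input suppresses responses to poor matches.
        excitation = 1.0 + np.sum(weights * area)
        suppression = 1.0 + (r / (1.0 + r)) * inhibition
        return max(0.0, excitation / suppression - 1.0)

    def c_cell(s_responses):
        # "Complex" neuron: pools the S-cells of its visibility area, so the
        # feature is still reported after a small shift of the input.
        return float(np.max(s_responses))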

The input layer is usually represented as a zero complex layer [1]. Conceptually, complex matrices are created to generalize the results of recognition in simple matrices. For example, if neurons of the first simple matrix recognize one written form of the symbol "g" and the second matrix recognizes another form of "g", these results should be combined in a complex matrix, since both are the same English letter.

A neuron in a simple or a complex layer does not receive signals from all neurons of the previous layer, but only from those to which it is connected. This set of neurons is called its "visibility area" [1, 3, 4]. Unlike the term "receptive field", the term "visibility area" denotes the neurons of the previous layer to which a given neuron is connected, whereas "receptive field" denotes the part of the input layer that the neuron effectively recognizes. The central neuron of each matrix always "sees" the central neuron of the previous layer, and that neuron is the centre of its "visibility area" (fig. 2).
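
Because the "visibility area" is defined from layer to layer while the "receptive field" is measured on the input layer, the size of the receptive field can be computed by accumulating visibility-area sizes through the layers. A short sketch under assumed layer parameters:

    def receptive_field(layers):
        # layers: (k, s) pairs, where k is the visibility-area size and s is
        # the step between centres of neighbouring neurons, both measured in
        # the previous layer (the parameters below are assumptions).
        rf, step = 1, 1
        for k, s in layers:
            rf += (k - 1) * step  # each layer widens the field on the input...
            step *= s             # ...faster after every compressing layer
        return rf

    # three layers with 3x3 visibility areas, the middle one compressing 2:1
    print(receptive_field([(3, 1), (3, 2), (3, 1)]))  # -> 9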

There are layers which compress graphic information and layers which do not. The dimension of the matrices in a compressing layer is smaller than the dimension of the matrices in the previous layer. Various methods can be used to compress graphic information. In the implemented model, compressing layers use the following rule: for a neuron displaced by 1 position from the central neuron of its matrix, the centre of its "visibility area" is displaced by 2 positions from the centre of the matrix of the previous layer [1].
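
In code, this rule amounts to mapping neuron (i, j) of a compressing matrix to position (2i, 2j) of the previous layer, which is what halves the matrix dimension. A minimal sketch of the mapping:

    def visibility_center(i, j, compressing=True):
        # Centre of the visibility area in the previous layer for neuron (i, j):
        # neighbouring neurons are 1 position apart in their own matrix but,
        # in a compressing layer, 2 positions apart in the previous layer.
        step = 2 if compressing else 1
        return (i * step, j * step)

    # neurons (0,0), (0,1), (0,2) look at previous-layer columns 0, 2, 4
    for j in range(3):
        print(visibility_center(0, j))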

References

  1. Fausett L. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. – Prentice Hall, New Jersey, 2000.
  2. Wasserman P. Neural Computing: Theory and Practice (Russian edition: Нейрокомпьютерная техника. Теория и практика). – Moscow: Mir, 1992. – 260 p.
  3. Freeman J., Skapura D. Neural Networks: Algorithms, Applications, and Programming Techniques. – Addison-Wesley, 1991. – P. 373-392.
  4. Satoh S., Kuroiwa J., Aso H., Miyake S. Recognition of Hand-written Patterns by Rotation-invariant Neocognitron // Proc. of ICONIP'98, 1998. – P. 295-299.
  5. Satoh S., Kuroiwa J., Aso H., Miyake S. Pattern Recognition System with Top-Down Process of Mental Rotation // Proc. of IWANN'99, 1999. – P. 816-825.