DonNTU Master's Student Cherkaev Oleg Anatolyevich

Cherkaev Oleg


Faculty: Radio Engineering
Speciality: Technical Protection of Information
Theme of master's work: Development of video processing methods for the protection of extended objects
Supervisor: D.P.S., Prof. Stefanenko Pavel




Abstract



Development of video processing methods for the protection of extended objects

Introduction
Relevance
Purpose
Expected scientific novelty and practical value
General characteristics of the tasks of pattern recognition and their types
Review of research and developments
Planned research and development
Conclusions
References

Introduction
Nowadays, protecting a site requires more than fencing it with barbed wire, providing round-the-clock lighting and posting guards around the perimeter. As methods of breaching the security of protected objects improve, the means of protection must improve with them. In particular, the use of video surveillance systems with video analytics is justified: the recognition of various objects and the classification of the current situation at the site. Improving the image processing and decision-making algorithms of such a system is a guarantee of quality protection and, as a consequence, of the security of the protected object.

Relevance
This work addresses a relevant problem: advancing existing approaches to recognising events in a video stream and applying that recognition to practical tasks. We further develop a method that tracks the events occurring in the video stream by subtracting the current shot from a base shot.

Purpose
Our purpose is to examine existing methods for detecting events in a video stream, to identify their shortcomings and to propose ways of eliminating them. The task of the work is to develop a method, free of the drawbacks of existing solutions, that enhances the effectiveness of video processing.

Expected scientific novelty and practical value
The video surveillance system developed for an extended object must recognise dangerous situations in real time, using special video processing and video analytics algorithms, and inform the operator about them automatically.

Review of research and developments
A vast number of works is devoted to image processing problems. The best-known works in this area include William Pratt's «Digital Image Processing» and V. A. Soifer's «Computer Image Processing».

There are various scientific and applied computer programs, as well as industrial software packages, that solve these problems; it is impossible to describe them all. We note only that they often serve a specialised purpose, and only a small fraction of them can be used to protect objects of great extent. In addition, many are only semi-automatic and require the operator to perform many of the actions, so they cannot be used in automatic protection and management systems. Natural phenomena such as rain, snow, fog and wind can also introduce noticeable oscillations into an initially static scene.

Algorithms for motion detection and analysis must usually remain stable over a broad range of significantly different external conditions. In general, the requirements for such algorithms are as follows.

  1. Low computational complexity and the ability to work in real time.
  2. Stable detection at any time of day, with the possible presence of artificial lighting.
  3. Stable operation at any time of year under complex weather conditions.

In general, we note that many scientific and applied problems related to automatic image analysis are not yet fully resolved.

General characteristics of the tasks of pattern recognition and their types
A pattern is a structured description of an object or phenomenon, represented by a feature vector, each element of which is the numerical value of one of the features characterising the corresponding object. The overall structure of the recognition system and the stages of its development are shown in Fig. 1.

Figure 1. The structure of the recognition system

The essence of the recognition task is to determine whether the objects under study possess a fixed finite set of features that allows them to be assigned to a certain class.

Recognition problems have the following characteristics.
  1. They are information tasks consisting of two phases: a) converting the source data to a form suitable for recognition; b) recognition proper (assigning the object to one of the defined classes).
  2. In these problems one can introduce the concept of analogy or similarity of objects, and formulate a notion of proximity of objects as a basis for assigning them to the same class or to different classes.
  3. In these tasks the algorithm can operate on a set of precedents, i.e. examples whose classification is known and which, in the form of formalised descriptions, can be presented to the recognition algorithm for tuning to the task during learning.
  4. For these problems it is difficult to build a formal theory and apply classical mathematical methods (the data needed for an accurate mathematical model are often inaccessible, or the benefit of using such models and methods does not compensate for their cost).
  5. These tasks can operate with «bad» information (information with gaps; heterogeneous, indirect, unclear, ambiguous or probabilistic information).
It is advisable to distinguish the following types of recognition problems.
  1. The recognition task: classifying a presented object, by its description, into one of the given classes (supervised learning).
  2. The automatic classification task: partitioning a set of objects (situations), by their descriptions, into a system of disjoint classes (taxonomy, cluster analysis, unsupervised learning).
  3. The task of selecting an informative feature set for recognition.
  4. The task of bringing the initial data to a form suitable for recognition.
  5. Dynamic recognition and dynamic classification: problems 1 and 2 for dynamic objects.
  6. The forecasting problem: problem 5, in which the decision must refer to some point in the future.


Let us consider the tools that will be used in developing the application:

Intershot difference
Calculating the intershot difference is a very common method of initial motion detection; generally speaking, it lets us say whether there is movement in the stream of shots. However, the shots must be pre-processed before the difference between them is calculated. The algorithm for computing the intershot difference of two shots, for the case of colour video in RGB format, is as follows:

  1. The algorithm receives two video shots as input, each a sequence of bytes in RGB format.
  2. The pixelwise intershot difference is calculated.
  3. For each pixel, the average of the differences of the three colour components is computed.
  4. The average value is compared with a prescribed threshold; the comparisons form a binary mask.

Thus, the output of the algorithm is a binary mask in which one element corresponds to the three colour components of a source pixel in the two shots. Ones in the mask are located in areas where movement may be present, although at this stage false positives are possible: individual mask elements may be incorrectly set to 1. The two input shots may be consecutive shots from the stream, but it is also possible to use shots separated by a longer interval, for example 1-3 shots. The greater this interval, the higher the sensitivity to small moving objects, which shift only slightly from one shot to the next and may otherwise be lost, being mistaken for the noise component of the image.
A disadvantage of this method is that it also registers the equipment noise recorded with the data. Noise inevitably appears on any modern camera, so it must be combated separately.
The advantages of this method are its simplicity and its low demand for computational resources. The method was widely used earlier, when developers did not have sufficient computing power. The complexity of the algorithm is of the order O(n), and it runs in just one pass, which is very important for images of large dimensions.
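The four steps above can be sketched as follows. This is a minimal illustration, with frames represented as flat lists of (R, G, B) tuples and an illustrative threshold value; function and parameter names are assumptions, not part of the described system.

```python
def intershot_mask(frame_a, frame_b, threshold=30):
    """Return a binary mask: 1 where the averaged per-pixel
    colour difference exceeds the threshold, else 0."""
    mask = []
    for (r1, g1, b1), (r2, g2, b2) in zip(frame_a, frame_b):
        # Steps 2-3: per-channel difference, averaged over R, G, B.
        avg = (abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)) / 3
        # Step 4: threshold the average into a binary mask element.
        mask.append(1 if avg > threshold else 0)
    return mask
```

For two shots that differ only in their second pixel, the mask comes out as [0, 1]: a one only where the difference exceeded the threshold.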

Basic shot
When describing the method of pixelwise intershot differences with construction of a movement mask, we processed, as the two input shots, any two shots taken a short interval apart. However, this method also allows the difference to be calculated against a scene that contains only the fixed background area (the base shot). Such an approach gives a significant increase in the probability of finding any object, whether the slowest or the fastest, precisely at the point where it is at that moment. This method is otherwise referred to as background subtraction or background segmentation.
This method works exactly like the intershot difference algorithm, with the sole difference that the difference is calculated between the current shot and the base shot. The big problem here is how to build the base shot, as it must have several properties:

  • if the base shot is a shot of real images, it must be separated from the current shot by a minimal interval;
  • if the base shot is prepared artificially, it must contain a minimum number of moving parts, otherwise false alarms on objects are inevitable;
  • minimal noise: before the base shot is updated, it should be filtered.
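As a sketch of the base-shot approach, the fragment below keeps a grayscale base shot updated with a simple running average. The blending factor alpha and the function names are illustrative assumptions; the text above does not prescribe a particular update rule.

```python
def update_base_shot(base, current, alpha=0.05):
    """Blend the current shot into the base shot with a running
    average, so slow scene changes (e.g. lighting drift) are
    gradually absorbed into the background."""
    return [(1 - alpha) * b + alpha * c for b, c in zip(base, current)]

def background_mask(base, current, threshold=30):
    """Difference against the base shot instead of the previous shot:
    even a stationary foreground object keeps producing ones."""
    return [1 if abs(c - b) > threshold else 0 for b, c in zip(base, current)]
```

A small alpha keeps fast-moving objects out of the base shot while still letting it track gradual changes in the scene.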

Mathematical morphology
Mathematical morphology is the analysis of an image with respect to its form. Algorithms based on this approach apply to the image a series of transformations that change the shape of the objects it contains. Mathematical morphology is applied in various image processing systems, at different stages and for different purposes:

  • improving the visual characteristics of the image (brightness, contrast, etc.);
  • restoring damaged images, such as the restoration of photographs;
  • contour detection;
  • noise reduction.
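A minimal sketch of two basic morphological operations on a binary movement mask, using a fixed 3x3 structuring element; the element shape and the pure-Python list representation are illustrative choices.

```python
def dilate(mask):
    """Binary dilation with a 3x3 structuring element:
    a pixel becomes 1 if any pixel in its 3x3 neighbourhood is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = int(any(
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))))
    return out

def erode(mask):
    """Binary erosion: a pixel stays 1 only if its whole 3x3
    neighbourhood is 1 (border pixels are treated as 0)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = int(all(
                mask[ny][nx]
                for ny in range(y - 1, y + 2)
                for nx in range(x - 1, x + 2)))
    return out
```

Applying erode followed by dilate (an opening) removes isolated noise pixels from the movement mask while roughly preserving the shape of larger connected groups.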

Object selection
Let us define the concept of an object from the viewpoint of the motion detector.
When selecting objects in an image, the human eye works in conjunction with the brain, which already has enough information about the types, shapes, lines, sizes, colours and other characteristics typical of the various objects a human has encountered before. Thus, together with selecting the objects in the scene, the human brain performs a recognition procedure, so a human can easily distinguish objects familiar to him. The selection of objects by a motion detector based on the pixelwise intershot difference algorithm begins with an analysis of the movement mask. Real objects usually correspond to pixels (or mini-zones) that form connected groups, so it is logical to define an object, in terms of motion detection, as a group of connected pixels in the movement mask. Such an object is described by several parameters:

  1. The linear dimensions of the minimal rectangle that can be described around the group of pixels.
  2. The coordinates (x, y) of the central point of the rectangle; we take this as the central point of the object.
  3. The number of pixels belonging to the group.
  4. The area of the movement mask lying inside the rectangle.
  5. The area of the current frame lying inside the rectangle.

Since the detector only detects moving objects, this group of pixels and the corresponding object will shift in the new frames arriving at the detector input, so besides the parameters listed above we can introduce a few more:

  6. A vector describing the direction and speed of the object.
  7. An array containing the coordinates of the object in previous frames.
  8. The lifetime of the object, measured in frames.

Object selection proceeds by consecutive examination of the pixels of the movement mask. When a non-zero value is found in the mask, a procedure is started that searches for connected pixels in the movement mask. Found pixels are marked to avoid re-detection.
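The selection procedure described above can be sketched as a flood fill over the movement mask; the dictionary fields correspond to parameters 1-3 of an object, and the field names and 4-connectivity choice are illustrative assumptions.

```python
from collections import deque

def find_objects(mask):
    """Scan the movement mask; for each group of connected non-zero
    pixels return its bounding box (x0, y0, x1, y1), centre point
    and pixel count, marking visited pixels to avoid re-detection."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                # Flood-fill the connected group (4-connectivity).
                queue, pixels = deque([(y, x)]), []
                seen[y][x] = True
                while queue:
                    cy, cx = queue.popleft()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                objects.append({
                    "bbox": (min(xs), min(ys), max(xs), max(ys)),
                    "center": (sum(xs) / len(xs), sum(ys) / len(ys)),
                    "pixels": len(pixels),
                })
    return objects
```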

Correlation function
The cross-correlation function of two images shows the degree of correlation between the two pictures. It is usually used to calculate the degree of similarity between regions in different frames. This function should give a single maximum only for two identical images. The normalised cross-correlation function is used very often; its maximum value, unity, corresponds to the full coincidence of the first and second raster rectangles.
Algorithms that use the correlation function work as follows.
First, they compute several values of the function by imposing a rectangle taken from one raster onto different positions of the other raster. Then the maximum of the obtained values is chosen. If the best value is sufficiently close to unity, then the first raster contains precisely the object that fell into the «correlated» rectangle taken from the second raster. The degree of proximity of the correlation function value to unity is a parameter whose value can be chosen to achieve the optimum ratio between the number of false detections and missed detections.
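A sketch of the normalised cross-correlation of two equal-size rasters, here flattened to lists of intensities as a simplification of the 2D case; comparing the result with a similarity threshold, as described above, is left to the caller.

```python
import math

def ncc(a, b):
    """Normalised cross-correlation of two equal-size rasters
    (flat lists of intensities); equals 1.0 for identical images
    and -1.0 for perfectly anti-correlated ones."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    den_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    # Guard against flat (zero-variance) rasters.
    return num / (den_a * den_b) if den_a and den_b else 0.0
```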

Tracing
The term implies tracking a moving object and the values of its parameters during the whole period while the object is present in the frame. When an object is properly traced, it becomes possible to view its trajectory at any level of detail (with single-frame accuracy, or every 3, 5, etc. frames). This is achieved by saving the position of the object's central point in each frame in a special array created for this purpose while the parameters of each moving object are determined. Movement tracing algorithms should not confuse one object with another, or lose an object that stops moving for a short period of time or hides behind an obstruction (for example, a man hiding behind a tree) and then reappears.
When the correlation function is calculated to detect, in the current frame, an object detected previously, only the pixels corresponding to movement (those with non-zero values in the movement mask) are used. Otherwise, the formula for the correlation function would receive the colour values of pixels lying within the rectangular area but outside the object; these pixels do not necessarily correlate with each other, so the value of the correlation function would be reduced. If it falls below the selected similarity threshold, the algorithm may incorrectly judge this rectangular area to be unlike the raster of the object with which the comparison is made, even if in fact it is not.
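A minimal sketch of the bookkeeping side of tracing: each detected centre point is matched to the nearest existing track, extending that track's trajectory array and lifetime (parameters 7 and 8 above). The distance threshold, data layout and matching rule are illustrative assumptions; the text describes correlation-based matching, which this nearest-centre rule merely stands in for.

```python
def match_objects(tracks, detections, max_dist=20.0):
    """Assign each detected centre point to the nearest existing
    track, or open a new track if none is close enough.
    tracks: list of dicts with 'trail' (centre points per frame)
    and 'age' (lifetime in frames)."""
    for cx, cy in detections:
        best, best_d = None, max_dist
        for tr in tracks:
            px, py = tr["trail"][-1]          # last known position
            d = ((cx - px) ** 2 + (cy - py) ** 2) ** 0.5
            if d < best_d:
                best, best_d = tr, d
        if best is not None:
            best["trail"].append((cx, cy))    # extend the trajectory
            best["age"] += 1
        else:
            tracks.append({"trail": [(cx, cy)], "age": 1})
    return tracks
```

The per-track trail array is exactly the structure that lets an operator replay an object's trajectory at any level of detail.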


Figure 2. Animated image of the recognition process in operation (moving cars and people); animation: 4 frames, repeats indefinitely, 124 KB.

Planned research and development

Conclusions
The results of this work are a review of the basic methods and algorithms for recognising events in a video stream and an identification of their main disadvantages. A way of improving the recognition algorithms at the cost of increased memory requirements was also reviewed.
The main goal of this work is to develop a method that increases the effectiveness of video processing for the protection of extended objects.

This is a summary of work still in progress. Completion is scheduled for December 2010. The results can be obtained by writing to the author.

References

  • Baigarova N.S., Bukhshtab Yu.A., Evteeva N.N. Modern technology of content-based search in electronic image collections // Electronic Libraries, 2001, vol. 4, issue 4.
  • Jain R., Gupta A. Visual Information Retrieval // Communications of the ACM, 1997, vol. 40, no. 5.
  • Baigarova N.S., Bukhshtab Yu.A. Some principles of organising video data retrieval // Programming, no. 3; Electronic Libraries, 1999.
  • Gaganov V., Konushin A. Segmentation of moving objects in a video stream [Electronic resource] / Graphics & Media Lab, http://cgm.graphicon.ru/obzoryi/segmentatsiya_dvizhuschihsya_obektov_v_video_potoke.html.
  • Belyavtsev V.G., Voskoboynikov Yu.E. Local adaptive algorithms for digital image filtering // Scientific Bulletin of NSTU, 1997, no. 3.
  • Putyatin E.P., Averin S.I. Image Processing in Robotics. Moscow: Mashinostroenie, 1990, 320 p.
  • Gorelik A.P., Skripkin V.A. Recognition Methods. Moscow: Vysshaya Shkola, 1989, 216 p.
  • Pavlidis T. Algorithms for Graphics and Image Processing (Russian translation). Moscow: Radio i Svyaz, 1986.
  • Forsyth D.A., Ponce J. Computer Vision: A Modern Approach (Russian edition). Williams, 2004, 928 p.
  • Girenko A.V., Lyashenko V.V., Mashtalir V.P., Putyatin E.P. Methods of Correlation Detection of Objects. Kharkov: BiznesInform, 1996, 112 p.




2010 Cherkaev Oleg, DonNTU