Let us consider the tools that will be used when developing applications:
Inter-frame difference
Computing the inter-frame difference is a very common method for the initial detection of motion; broadly speaking, its result tells us whether there is movement in the frame stream. The frames, however, must be pre-processed before the difference between them is computed.
The algorithm for computing the inter-frame difference of two frames, for the case of a color video in RGB format, is as follows:
- The algorithm receives two video frames, each a sequence of bytes in RGB format.
- The pixel-wise inter-frame difference is computed.
- For each pixel, the average of the differences of the three color components is computed.
- The average value is compared with a prescribed threshold; the comparisons form a binary mask.
Thus, the output of the algorithm is a binary mask in which one element corresponds to the three color components of a source pixel of the two frames. Ones in the mask lie in areas where there may be movement, although at this stage false positives are possible: individual mask elements may be incorrectly set to 1. The two input frames may be consecutive frames of the stream, but frames separated by a longer interval, for example 1-3 frames, can also be used. The greater this interval, the higher the sensitivity to small moving objects, which shift only very slightly from one frame to the next and may otherwise be cut off, being mistaken for the noise component of the image.
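As an illustration, the steps above can be sketched in Python. The frame representation (a flat list of (R, G, B) tuples) and the threshold value 30 are assumptions made for the example, not part of the method itself.

```python
# Sketch of the inter-frame difference: two RGB frames in, binary mask out.
# A frame is assumed to be a flat list of (R, G, B) tuples; threshold=30 is arbitrary.

def motion_mask(frame_a, frame_b, threshold=30):
    """Return a binary mask: 1 where the averaged RGB difference exceeds the threshold."""
    mask = []
    for (r1, g1, b1), (r2, g2, b2) in zip(frame_a, frame_b):
        # Average the absolute differences of the three color components.
        avg_diff = (abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)) / 3
        mask.append(1 if avg_diff > threshold else 0)
    return mask

# Two 4-pixel "frames": only the last pixel changes noticeably.
prev = [(10, 10, 10), (200, 200, 200), (50, 60, 70), (0, 0, 0)]
curr = [(12, 11, 10), (199, 201, 200), (50, 60, 70), (120, 130, 140)]
print(motion_mask(prev, curr))  # -> [0, 0, 0, 1]
```

Note that the single pass over the pixels reflects the O(n) complexity discussed below.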
A disadvantage of this method is that it also captures the noise introduced by the equipment during recording. Noise inevitably appears with any modern camera, so it must be dealt with separately.
The advantages of this method are its simplicity and low demand for computational resources. It was widely used in the past because developers did not have sufficient computing power. The complexity of the algorithm is of the order O(n), and it runs in just one pass, which is very important for images of large dimensions.
Base frame
In describing the method of pixel-wise inter-frame differences with the construction of a movement mask, we processed, as the two input frames, any two frames taken a short interval apart. However, this method also allows the difference to be computed against a frame that contains only the fixed area of the background (the base frame). Such an approach gives a significant increase in the probability of finding any object, whether the slowest or the fastest, precisely at the point where it is at that moment. This method is otherwise referred to as background subtraction or background segmentation.
The method works exactly like the inter-frame difference algorithm, with one distinction: the difference is computed between the current frame and the base frame. The big problem here is how to construct the base frame, since it must have several properties:
- if the base frame is a real captured frame, it must be separated from the current frame by a minimal time interval;
- if the base frame is prepared artificially, it must contain a minimal number of moving objects, otherwise false alarms on those objects are inevitable;
- minimal noise: the base frame should be filtered before it is updated.
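One common way to build and maintain a base frame (an assumption of this sketch, not prescribed by the text above) is an exponential running average over incoming frames; for brevity the example works on single-channel intensity values and uses an illustrative threshold.

```python
# Background subtraction sketch: compare the current frame against a base frame,
# then slowly blend the current frame into the base. Frames are flat lists of
# intensity values; alpha and threshold are illustrative choices.

def update_base(base, frame, alpha=0.05):
    """Blend the current frame into the base frame (exponential running average)."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(base, frame)]

def background_mask(base, frame, threshold=30):
    """1 where the current frame departs from the base frame by more than the threshold."""
    return [1 if abs(f - b) > threshold else 0 for b, f in zip(base, frame)]

base = [100.0, 100.0, 100.0]
frame = [100.0, 101.0, 200.0]        # last pixel covered by a moving object
print(background_mask(base, frame))  # -> [0, 0, 1]
base = update_base(base, frame)      # slowly absorb gradual scene changes
```

A small alpha keeps the base frame stable (low noise), at the cost of absorbing scene changes slowly.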
Mathematical morphology
Mathematical morphology is an approach to image analysis concerned with form. Algorithms based on this approach apply a series of transformations to the image that change the shape of the objects it contains. Mathematical morphology is applied in various image-processing systems, at different stages and for different purposes:
- improving the visual characteristics of the image (brightness, contrast, etc.);
- restoring damaged images, such as the restoration of photographs;
- detection of contours (edges);
- noise reduction.
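For instance, an erosion followed by a dilation (a morphological opening) is often applied to the movement mask to suppress isolated noise pixels; the sketch below assumes a 3×3 structuring element and a simplified treatment of image borders (out-of-bounds neighbors are simply ignored).

```python
# Morphological opening on a binary mask: erosion removes isolated 1-pixels,
# dilation restores the surviving regions to (roughly) their original size.

def _neighbors(mask, y, x):
    h, w = len(mask), len(mask[0])
    return [mask[y + dy][x + dx]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            if 0 <= y + dy < h and 0 <= x + dx < w]

def erode(mask):
    """1 only where every in-bounds pixel of the 3x3 neighborhood is 1."""
    return [[1 if all(_neighbors(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

def dilate(mask):
    """1 where any in-bounds pixel of the 3x3 neighborhood is 1."""
    return [[1 if any(_neighbors(mask, y, x)) else 0
             for x in range(len(mask[0]))] for y in range(len(mask))]

noisy = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],   # isolated noise pixel
    [0, 0, 0, 1, 1],   # a solid 3x2 moving region
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
opened = dilate(erode(noisy))  # noise pixel removed, solid region kept
```

Ignoring out-of-bounds neighbors makes border pixels slightly easier to keep under erosion; production code would pick an explicit border policy.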
Object selection
Let us define the concept of the object from the viewpoint of the motion detector.
When selecting objects in an image, the human eye works in conjunction with the brain, which already holds enough information about the types, shapes, lines, sizes, colors and other characteristics typical of the various objects a human has encountered before. So, together with selecting the objects in a scene, the human brain performs a recognition procedure, which is why a human can easily distinguish familiar objects. The selection of objects by a motion detector based on the pixel-wise inter-frame difference algorithm begins with an analysis of the movement mask. Real objects usually correspond to pixels (or mini-zones) that form connected groups, so it is logical to define an object, in terms of motion detection, as a group of connected pixels in the movement mask. Such an object is characterized by several parameters:
- The linear dimensions of the minimal rectangle that can be circumscribed around the group of pixels.
- The coordinates (x, y) of the central point of that rectangle. We take this point as the central point of the object.
- The number of pixels belonging to the group.
- The area of the movement mask that lies inside the rectangle.
- The area of the current frame that lies inside the rectangle.
Since the detector detects only moving objects, this group of pixels, and the corresponding object, will shift in the new frames arriving at the detector input, so besides the parameters listed above we can introduce a few more:
- The vector describing the direction and speed of the object.
- An array containing the coordinates of the object in the previous frames.
- The lifetime of the object, measured in the number of frames.
Object selection proceeds by examining the pixels of the movement mask consecutively. When a nonzero value is found in the mask, a procedure is started that searches for the connected pixels in the movement mask. Found pixels are marked to avoid re-detection.
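The selection procedure just described can be sketched as a connected-component search over the movement mask; the mask representation, the use of 4-connectivity, and the names of the returned parameters are assumptions made for the example.

```python
# Scan the movement mask; on each unvisited 1-pixel, flood-fill (iteratively)
# the connected group and record its bounding rectangle, center, and pixel count.

def extract_objects(mask):
    """Return one dict per connected group of 1-pixels (4-connectivity)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                seen[y][x] = True
                stack, pixels = [(y, x)], []
                while stack:
                    cy, cx = stack.pop()
                    pixels.append((cy, cx))
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True   # mark to avoid re-detection
                            stack.append((ny, nx))
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                objects.append({
                    "bbox": (min(xs), min(ys), max(xs), max(ys)),
                    "center": ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2),
                    "pixels": len(pixels),
                })
    return objects

mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 1]]
print(len(extract_objects(mask)))  # -> 2
```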
Correlation function
The cross-correlation function of two images is a function that shows the degree of correlation between two pictures. It is usually used to calculate the degree of similarity between regions in different frames. This function should reach a single maximum only in the case of two identical images. The normalized cross-correlation function is used very often; its maximum value, unity, corresponds to a full coincidence of the first and second raster rectangles.
Algorithms that use the correlation function work as follows.
First, several values of the function are found by overlaying, in different positions, a rectangle taken from one raster onto the other raster. Then the maximum of the obtained values is chosen. If this best value is sufficiently close to unity, then the object that fell into the «correlated» rectangle taken from the second raster has been found precisely in the first raster. The required degree of proximity of the correlation function value to unity is a parameter whose value can be chosen to achieve the optimal ratio of false detections to detections in general.
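A minimal sketch of this search, assuming the Pearson (zero-mean) form of the normalized cross-correlation and, for brevity, a 1-D raster:

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size intensity patches.
    Returns 1.0 for a perfect match; 0.0 is returned for a constant patch."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def best_match(row, patch):
    """Slide `patch` along a 1-D raster `row`; return (best offset, best NCC value)."""
    scores = [(ncc(row[i:i + len(patch)], patch), i)
              for i in range(len(row) - len(patch) + 1)]
    s, i = max(scores)
    return i, s

patch = [10, 40, 20, 30]
row = [5, 5, 10, 40, 20, 30, 5, 5]
print(best_match(row, patch))  # -> (2, 1.0): exact copy found at offset 2
```

In practice the best score is compared against the similarity threshold mentioned above before the match is accepted.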
Tracing
The term implies tracking a moving object, and the values of its parameters, during the whole period while the object is present in the frame. When an object is properly traced, it becomes possible to view its trajectory at any level of detail (to frame accuracy, or to within 3, 5, etc. frames). This feature is achieved by saving the position of the object's central point in each frame in a special array, created for this purpose while the parameters of each moving object are determined. Movement-tracing algorithms must not confuse one object with another, or lose an object that stops moving for a short period of time or hides behind an obstruction (for example, a man hiding behind a tree) and then reappears.
When the correlation function is computed in order to detect, in the current frame, an object detected previously, only the pixels corresponding to movement (those with a non-zero value in the movement mask) are used. Otherwise, the formula for the correlation function would take in the color values of pixels lying within the rectangular area but outside the object, and these pixels do not necessarily correlate with each other, so the value of the correlation function would be reduced. If it fell below the selected similarity threshold, the algorithm might incorrectly consider this rectangular area dissimilar to the raster object with which the comparison is made, even when in fact it is not.
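A much-simplified tracing step might match each existing track to the nearest new detection by its central point; the greedy matching, the distance threshold, and the per-track trajectory array are all assumptions of this sketch, and it does not handle the stops and occlusions that, as noted above, real tracing algorithms must survive.

```python
import math

def trace(tracks, detections, max_dist=20.0):
    """Greedily match new detections (center points) to existing tracks.
    Each track is the special array of the object's center, one point per frame."""
    unmatched = list(detections)
    for track in tracks:
        if not unmatched:
            break
        last = track[-1]
        nearest = min(unmatched, key=lambda p: math.dist(p, last))
        if math.dist(nearest, last) <= max_dist:
            track.append(nearest)       # extend the trajectory
            unmatched.remove(nearest)
    for p in unmatched:                 # unmatched detections start new tracks
        tracks.append([p])
    return tracks

tracks = [[(10, 10)], [(100, 50)]]
trace(tracks, [(12, 11), (101, 52), (200, 200)])
print(len(tracks))  # -> 3: two extended tracks plus one new object
```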
Fig. 2. Animated image of the recognition process (moving cars and people).
Planned research and development
- explore various algorithms for recognizing events in a video stream;
- analyze the effectiveness of the different approaches and choose the optimal «speed-quality» ratio;
- apply a partition of the video stream into two parts;
- analyze the results of the splitting and find out the benefits of this process.
Conclusions
The results of this work include a review of the basic methods and algorithms for recognizing events in a video stream and the identification of their main disadvantages. A way to improve the recognition algorithms at the cost of an increased amount of required memory was also reviewed.
The main goal of this work is to develop a method that increases the effectiveness of video processing for the protection of objects of large extent.
This is a summary of work still in progress. Its completion is scheduled for December 2010. The results can be obtained by writing a letter to the author.