Master of Donetsk National Technical University Altynpara Eugene

Altynpara Eugene
Faculty: Computer Engineering and Informatics
Speciality: Software of Automated Systems
Theme of master's work: Solving large-scale problems on a cluster
Supervisor: Ph.D. Y. V. Ladyzhensky

Summary of the master's work
Introduction:

Advances in science now make it possible to solve the most complex computational problems. They are solved on supercomputers.

Computing clusters are being developed that solve complex problems faster by exploiting parallel data processing. As a result, research in parallel computing is becoming a priority. One of the most interesting directions is the processing of large volumes of video and image data on a cluster.

A suitable platform for research and system engineering in video image processing is Microsoft Compute Cluster Server 2003. It is the leading computing-cluster platform for Microsoft Windows and is now compatible enough to be integrated with Linux-based computing clusters.

Video image processing on clusters is only beginning to be studied. Research in this area exists, but many specific problems remain open, such as how best to process large volumes of data and which object recognition algorithm is best suited to systems based on computing clusters.

Solving these problems requires developing special algorithms, organizing dedicated computer networks (clusters), and developing interfaces for processing and for message passing between cluster nodes. Such research can make a valuable contribution to science and technology.

Relevance of the topic:

The effectiveness of processing large volumes of video images for economic and research purposes is no longer in doubt, and the range of problems solved using the results of such processing is continuously expanding.

The purpose of the project is to create a new system for processing large volumes of data (video images) on Microsoft Compute Cluster Server 2003 using the OpenCV computer vision library. The developed software system can process various kinds of video data and is applicable to many current video processing problems (monitoring car traffic, tracking at airports and entertainment centres, analysing video recordings of sports matches, and others).

Expected scientific novelty:

The scientific novelty of the developed system is that it is, at present, the first system in which a Microsoft Compute Cluster Server 2003 computing cluster is used to process large volumes of video data by means of the OpenCV computer vision library.

A study of the literature and a search of materials on the Internet showed that most parallel computations for video image processing are developed from scratch, which is not always efficient in terms of either the algorithms or the time spent developing them.

At present there is not enough research in the field of video image recognition. This is due to the complexity of the problem and to how labour-intensive the research is.

A promising platform for implementing a parallel video processing environment is MS Compute Cluster Server [1]. MS CCS 2003 is an integrated platform for high-performance computing on cluster systems. MS CCS consists of the Windows Server 2003 operating system and Microsoft Compute Cluster Pack (CCP), a set of interfaces, utilities, and a management infrastructure. The built-in programming facility for an MS CCS cluster is MS Message Passing Interface (MPI), one of the standards for programming HPC applications [2].

The OpenCV computer vision library will be used to process video images. It is an open-source library, and most of its algorithms are efficient and well documented.

This combination (Microsoft Compute Cluster Server 2003 and OpenCV) should make it possible to achieve substantial results and to benefit from the synergy of two well-proven technologies.

Practical value of the results:

The developed software environment will make it possible to develop, quickly and efficiently, applications for processing and analysing large video images on a Microsoft Compute Cluster Server computing cluster.

Research on the topic:

The developed software environment consists of subcomponents and a structure designed for parallel computation. The structure includes:

  • The Message Passing Interface (MPI) from Microsoft is used to write code with parallel programming technologies.
  • The OpenCV computer vision library is used to process video images and manipulate video data [3].
  • Microsoft Visual Studio 2008 is the development environment for the software system.
  • The debugger built into Microsoft Visual Studio 2008 is used to debug applications that use MPI.

The OpenCV library is used to process video images in the developed software environment. It is a library of computer vision algorithms, image processing algorithms, and general-purpose numerical algorithms.

The main operations used for computation in the developed system are basic operations on multidimensional numerical arrays, basic 2D graphics functions, basic image operations (filtering, geometric transformations, color space conversion, etc.), image analysis (feature selection, morphology, contour detection, histograms), motion analysis, object tracking, and object detection. The developed system must accept large volumes of video data as input and process them in parallel using OpenCV library functions. The current OpenCV implementation includes about 500 ready-made functions, which cover almost all of the operations listed above. The system input is either a set of images or data in the form of a video file.

Depending on how the data arrive, different parallel architectures of the software system can be built. In the given software system, the video data are segmented and sent to every node, and the computations are carried out in parallel. Each node of the Microsoft Compute Cluster Server cluster has the MPI library and the OpenCV library. A controlling node manages data exchange between the nodes: it monitors each node and parses, splits, and transfers blocks of data.

In the pipeline scheme of the software environment, video data are passed between computing nodes as the computations require. The speed-up comes from the fact that all computing nodes are always busy computing. The controlling node monitors the load of each node; when data are needed at node i, it waits for the data to arrive and sends them to node i.

Developing the software environment has produced a base on which complex systems for processing video data on a Microsoft cluster can be built.

Let us consider the system in more detail. At start-up, the MPI interface is initialized with the function MPI_Init(&argc, &argv).

Next, the number of computing nodes (available processors) in the cluster must be determined. This is done with MPI_Comm_size(MPI_COMM_WORLD, &iCount). Its first argument is the global MPI communicator (the default communicator, which holds information about the cluster on which the program was started and about its nodes); its second argument is the variable that receives the number of nodes on output.

The number of the current node is determined with MPI_Comm_rank(MPI_COMM_WORLD, &iRank). Note that nodes in a cluster are numbered not in the order of their layout in the network but in an arbitrary order at cluster start. Knowing the number of the current node, we can determine which role it plays in the hierarchy of our system. In the developed software system, node 0 is the controlling node, and all nodes with a number greater than 0 are executing nodes.

Two things need to be specified: the structure of the controlling node and the structure of the executing node. The system has 1 controlling node and n-1 executing nodes (n is the number of nodes that make up the cluster).
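The rank convention described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code of the developed system: NodeRole, node_role and executing_node_count are hypothetical names, and in a real program the rank and the cluster size would be the values returned by MPI_Comm_rank and MPI_Comm_size.

```c
#include <assert.h>

/* Hypothetical names, introduced only for this sketch. */
typedef enum { CONTROLLING_NODE, EXECUTING_NODE } NodeRole;

/* Rank 0 controls; every rank greater than 0 executes. */
NodeRole node_role(int rank)
{
    assert(rank >= 0);
    return rank == 0 ? CONTROLLING_NODE : EXECUTING_NODE;
}

/* A cluster of n nodes has 1 controlling node and n-1 executing nodes. */
int executing_node_count(int cluster_size)
{
    assert(cluster_size >= 1);
    return cluster_size - 1;
}
```

For a cluster of 6 nodes, node_role(0) is CONTROLLING_NODE and executing_node_count(6) is 5.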

The controlling node. Uniform distribution of load among all nodes of the computing cluster is very important for the operation of the whole system. Therefore, the controlling node begins by calculating the distribution of load among the nodes.

The load calculation looks like this:
Number of computing nodes: n
Number of images: m
Average allocation: m/n
Example:
5 computing nodes.
Number of images: 1000.
Average allocation = 1000/5 = 200 images per node.
An example of this allocation is shown in the animation below.
Figure 1 - Scheme of load distribution among 5 nodes (number of frames: 6, repetitions: 5, duration of frames: 200ms)

If the number of video images is not a multiple of the number of nodes, the remaining images are distributed one at a time among the nodes, starting from node 0.
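The allocation rule, including the handling of the remainder, can be written as a short C function. This is a minimal sketch: the function name images_for_node and the 0-based numbering of computing nodes are assumptions made for the illustration.

```c
#include <assert.h>

/* Number of images assigned to computing node `node` (0-based) when
   m images are shared among n computing nodes. Every node receives the
   average m / n; the m % n leftover images go one each to nodes
   0, 1, 2, ... in order. The name is hypothetical. */
int images_for_node(int m, int n, int node)
{
    assert(m >= 0 && n > 0 && node >= 0 && node < n);
    int base = m / n;    /* average allocation */
    int extra = m % n;   /* images left after the even split */
    return base + (node < extra ? 1 : 0);
}
```

With 1000 images and 5 nodes, every node gets 200 images. With 1003 images, nodes 0, 1 and 2 get 201 images each and nodes 3 and 4 get 200.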

After the load analysis, the controlling node reads the data to be processed (the video images). The data are then sent to the executing nodes. This is done with the function int MPI_Send(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm), where:

buf - the address of the memory buffer holding the data of the message being sent; count - the number of data items in the message; type - the type of the data items in the message; dest - the rank of the process to which the message is sent; tag - a tag value used to identify the message; comm - the communicator within which the data transfer is performed.
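One way the controlling node might prepare the arguments of its MPI_Send calls is sketched below in plain C. The sketch makes several assumptions that are not stated in the text: the images are held in one array on the controlling node, each executing node (ranks 1 to n-1) receives one contiguous slice, and leftover images go to the lowest executing ranks first. SendPlan and build_send_plan are hypothetical names, and no MPI call is made here; each plan entry only supplies the dest and count values for one send.

```c
#include <assert.h>

/* Hypothetical description of one MPI_Send made by the controlling node. */
typedef struct {
    int dest;   /* rank of the receiving executing node (MPI_Send dest)   */
    int first;  /* index of the first image of the slice in the array     */
    int count;  /* number of images in the message (MPI_Send count)       */
} SendPlan;

/* Fills plan[0 .. cluster_size-2] for m images and returns the number
   of entries written. plan must have room for cluster_size-1 entries. */
int build_send_plan(int m, int cluster_size, SendPlan *plan)
{
    assert(m >= 0 && cluster_size >= 2 && plan != 0);
    int workers = cluster_size - 1;   /* rank 0 only controls */
    int base = m / workers;           /* average allocation */
    int extra = m % workers;          /* leftover images */
    int next = 0;                     /* start of the next slice */
    for (int w = 0; w < workers; ++w) {
        plan[w].dest = w + 1;         /* executing ranks start at 1 */
        plan[w].first = next;
        plan[w].count = base + (w < extra ? 1 : 0);
        next += plan[w].count;
    }
    return workers;
}
```

For 1003 images on a 6-node cluster, the plan has 5 entries: ranks 1 to 3 receive 201 images each, and ranks 4 and 5 receive 200 each.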

The executing node. It performs five functions: measuring execution time, receiving the data, saving the data to the local hard disk, reading the data, and filtering them.

Overview of results:

The experiments yielded the following data: the time needed to save and filter one image. From this, timing characteristics for a computing cluster can be calculated. The values obtained are "ideal": data transfer time over the network, time for sending data, and various delays during computation and data transfer are not taken into account. We obtained a value of 0.7 seconds for 1 computer; for several computers, it is multiplied by the number of computing nodes. This gives theoretical data for the developed system.

Dependence of computation time on the number of computing nodes when filtering images. The graph shows linear growth; in reality, the graph will change shape. This is because the time spent transferring data to each node will differ, and for large volumes of data the transfer time will fluctuate. More precise results will be obtained through further development and study of the system and experimental measurement of its characteristics.

Conclusions:

The developed system still needs a task description language, decomposition of the system into subtasks, an on-screen monitor for controlling image processing, and optimization of load distribution among the nodes.

Thanks to the architecture based on MS CCS and the OpenCV library, the developed environment reduces the time spent on processing video data and offers broad functionality.

By the time of the master's thesis defense, we expect to have a theoretical basis, practical results from studying the system, and a working software product ready to run on Microsoft Compute Cluster Server 2003.

References
  1. Microsoft Windows Compute Cluster Server 2003. Reviewer's Guide [Microsoft publication], May 2006. http://www.microsoft.com
  2. Technologies - MPI: The Message Passing Interface. Parallel.ru
  3. OpenCV, a library of computer vision algorithms. Wikipedia
  4. Seinstra F.J., Koelma D. Accurate Performance Models of Parallel Low Level Image Processing Operations Based on a Simple Abstract Machine / University of Amsterdam, The Netherlands, 2005 (https://eprints.kfupm.edu.sa).
  5. Bradski G.R., Kaehler A. Learning OpenCV - Computer Vision with the OpenCV Library / Sebastopol, United States of America, 2008 (http://oreilly.com/), pp. 1-141.
  6. CS101 "Introduction to Programming Methods", CS105 "Discrete Mathematics", CS220 "Computer Architecture", CS225 "Operating Systems", CS304 "Computational Methods" (http://www.winhpc.ru/).
  7. Gergel V.P. Theory and Practice of Parallel Computing (in Russian), http://www.intuit.ru/
  8. Nicolescu C., Jonker P. Parallel low-level image processing on a distributed-memory system / Delft University of Technology, The Netherlands, 2000 (www.springerlink.com/).
  9. Nicolescu C., Jonker P. A data and task parallel image processing environment / Delft University of Technology, The Netherlands, 2000 (www.citeseerx.ist.psu.edu/).
  10. Internet resource with technical documentation and support for the OpenCV library, http://sourceforge.net/projects/opencvlibrary/