DonNTU   Masters' portal

Abstract

Introduction

Purpose of the work and scientific novelty: The purpose is to develop simulation software that implements the functions of the load balancing subsystem of a distributed parallel simulation environment (DPSE). The subsystem should make DPSE friendlier to users and model developers and extend the use of parallel simulation to different problem domains. New approaches to parallelization will be developed, together with a method for constructing models of complex dynamic systems within the scope of the subsystem's functions. The soundness of decomposing DPSE into subsystems and the operability of the simulation software will be confirmed experimentally.

Main tasks to be solved:

  1. Analysis of the state of the art in simulation software for a distributed parallel simulation environment.
  2. Development of the concept of the subsystem (requirements, functions, hardware/software structure).
  3. Development of algorithms, structures, and methods; implementation and experimental study as part of a version of DPSE.

Practical significance of the work: building a prototype of the subsystem and using the simulation software in the distributed parallel simulation environment.

1. Main results available at the time of writing

Simulation is now widely used in many branches of science and technology. According to the definition in Wikipedia, simulation is the study of objects of knowledge by means of their models: the construction and analysis of models of real objects, processes, or phenomena in order to explain these phenomena and to predict the phenomena of interest to the researcher. Computer simulation, in which a computer program serves as the model, has become one of the most popular methods. A method for creating computer models is presented in [4].

Most simulated objects are inherently complex. This complexity carries over to the computer model as the computational complexity of the simulation program, whose run time can then become unacceptably long. To reduce the run time, computer models must be parallelized for execution on a parallel computer. A parallel computer model (hereinafter, the parallel program) thus consists of a set of logical processes that run in parallel on the processors of a parallel computer, interact with each other through exchange (communication) operations, and collectively solve the simulation task.

Balanced loading of the computational nodes (processors) means that arithmetic operations are distributed evenly among the processors. However, balancing the arithmetic load may unbalance the load on the communication network, which connects the processors of the parallel computer and carries the communication between logical processes. When some network links are overloaded, some logical processes may stand idle most of the time, waiting for data from other processes without which they cannot continue their work.
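The effect described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `imbalance` and `step_time` and all numbers are hypothetical and do not come from DPSE. It shows that a perfectly even arithmetic load can still give a slow simulation step when some processes wait for data.

```python
from statistics import mean

def imbalance(loads):
    """Ratio of the largest load to the mean load (1.0 means perfectly even)."""
    avg = mean(loads)
    return max(loads) / avg if avg > 0 else 1.0

def step_time(compute, wait):
    """Duration of one simulation step: the slowest process, including its waiting."""
    return max(c + w for c, w in zip(compute, wait))

# Hypothetical measurements for four processors, in seconds.
compute = [10.0, 10.0, 10.0, 10.0]  # arithmetic work, evenly balanced
wait = [0.0, 1.0, 6.0, 0.5]         # time spent waiting for data over the network

print(imbalance(compute))        # arithmetic load looks ideal
print(step_time(compute, wait))  # yet the step is limited by the process that waits longest
```

In this made-up case the arithmetic imbalance ratio is ideal, but the third process stretches every step because of its waiting time, so a balancer that looks only at arithmetic load would see nothing to fix.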

Load imbalance is inevitable for several reasons. One is the structural heterogeneity of the model: different logical processes require different amounts of computing power. Another is the heterogeneity of the parallel computer itself. Different computing nodes may have different computational power, and the bandwidth of the communication network may differ between different nodes. While a parallel program runs, previously occupied nodes may be freed and can then be used to unload the most heavily loaded nodes; conversely, a more privileged user may demand that some nodes be released immediately from their current task for use in that user's own task. Some nodes may serve several tasks at once in time-sharing mode, so the share of CPU time that a node allots to the current task may rise or fall. Finally, some computing nodes or communication links may fail.

Thus, the main purpose of the load balancing subsystem is to minimize the total time required to solve the task (the simulation time). When new tasks arrive in the system, the load balancing subsystem must decide on which computing node each new task is to be executed. In addition, balancing involves transferring (migrating) computations from the most heavily loaded computing nodes to less loaded ones.
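The two decisions mentioned above, where to place a new task and what to migrate, can be sketched as follows. This is a minimal illustration, not the subsystem's actual algorithm: the `Node` class, the `speed` and `share` fields, and the greedy rules are assumptions made for the example, and the cost of migration itself is ignored.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    speed: float                  # hypothetical units of work per second
    share: float = 1.0            # fraction of CPU time available to this job
    tasks: dict = field(default_factory=dict)  # task name -> remaining work units

    def finish_time(self, extra=0.0):
        """Estimated time to finish all tasks; `extra` adds (or, if negative, removes) work."""
        return (sum(self.tasks.values()) + extra) / (self.speed * self.share)

def place(nodes, task, work):
    """Put a new task on the node that would finish it earliest."""
    best = min(nodes, key=lambda n: n.finish_time(work))
    best.tasks[task] = work
    return best

def migrate_once(nodes):
    """Move one task from the most loaded node to the least loaded one,
    but only if that lowers the later of their two finish times."""
    src = max(nodes, key=Node.finish_time)
    dst = min(nodes, key=Node.finish_time)
    best_task, best_time = None, src.finish_time()
    for task, work in src.tasks.items():
        new_time = max(src.finish_time(-work), dst.finish_time(work))
        if new_time < best_time:
            best_task, best_time = task, new_time
    if best_task is None:
        return None
    dst.tasks[best_task] = src.tasks.pop(best_task)
    return best_task, src.name, dst.name

nodes = [Node("a", speed=1.0), Node("b", speed=2.0)]
print(place(nodes, "t1", 6.0).name)  # the faster node finishes this task earlier
print(migrate_once(nodes))           # None here: moving t1 to the slower node would not help
```

A real balancer would also weigh the time needed to transfer a process and its data; here that cost is left out to keep the sketch short.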

Load balancing is applied at two steps: the decomposition of the task into subtasks (processes) and the mapping of these processes onto the computing environment. Decomposition is part of creating the parallel program and splits the task into subtasks. The result of decomposing a distributed application is a set of subtasks that solve the problem in parallel. These subtasks may be independent or connected with each other through data exchange. Mapping the subtasks onto the computing environment is a separate step, which distributes the subtasks obtained at the decomposition step among the processors.

Figure 1 (animated GIF: 4 frames, 3 repeats, 500×400, 35.9 KB)
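The mapping step described above can be sketched for a set of subtasks that has already been obtained by decomposition. The example assumes identical processors and uses a common greedy heuristic (heaviest subtask first, each onto the currently lightest processor); this heuristic, the function names, and the numbers are chosen for illustration and are not taken from DPSE. The `evaluate` function then reports both the per-processor work and the amount of data that has to cross between processors.

```python
import heapq

def map_greedy(work, n_procs):
    """Assign subtasks to processors: heaviest first, each onto the lightest processor."""
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor index)
    mapping = {}
    for task in sorted(work, key=work.get, reverse=True):
        load, p = heapq.heappop(heap)
        mapping[task] = p
        heapq.heappush(heap, (load + work[task], p))
    return mapping

def evaluate(work, traffic, mapping, n_procs):
    """Return per-processor work and the total traffic between different processors."""
    loads = [0.0] * n_procs
    for task, p in mapping.items():
        loads[p] += work[task]
    cut = sum(v for (a, b), v in traffic.items() if mapping[a] != mapping[b])
    return loads, cut

# Hypothetical subtasks: work units per subtask and data exchanged between pairs.
work = {"s1": 5.0, "s2": 4.0, "s3": 3.0, "s4": 2.0}
traffic = {("s1", "s2"): 10.0, ("s2", "s3"): 3.0}

mapping = map_greedy(work, 2)
print(mapping)
print(evaluate(work, traffic, mapping, 2))
```

The heuristic looks only at the amount of work, so it may separate subtasks that exchange a lot of data; comparing the traffic value for several candidate mappings is one simple way to see that trade-off.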

2. Conclusion

Load balancing requires special software that includes tools for assessing the state of the computing environment and a control program that decides when to balance and which logical processes should be moved (migrated) from one processor to another. At present, the capabilities of different libraries for implementing process migration are being surveyed and analyzed with regard to the architecture and software of the DonNTU cluster.
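The decision about when to balance can be sketched as a simple rule. This is one possible rule under stated assumptions, not the control program of the subsystem: the function name `should_rebalance`, the threshold value, and the optimistic estimate of the gain are all hypothetical.

```python
def should_rebalance(step_times, remaining_steps, migration_cost, threshold=1.2):
    """Decide whether migration is worth starting now.

    step_times      -- measured time of the last step on each node, in seconds
    remaining_steps -- number of simulation steps still to run
    migration_cost  -- estimated one-off cost of moving processes, in seconds
    threshold       -- minimal slowest/average ratio that counts as imbalance
    """
    avg = sum(step_times) / len(step_times)
    slowest = max(step_times)
    if avg == 0 or slowest / avg < threshold:
        return False  # load is already even enough
    # Optimistic assumption: after balancing, every step takes about the average time.
    expected_saving = (slowest - avg) * remaining_steps
    return expected_saving > migration_cost

print(should_rebalance([4.0, 4.0, 4.0, 12.0], remaining_steps=100, migration_cost=50.0))
print(should_rebalance([4.0, 4.0, 4.0, 12.0], remaining_steps=5, migration_cost=50.0))
```

With the same measured imbalance, the rule starts migration when many steps remain and declines when the run is almost finished, because the one-off migration cost would then exceed the time it could save.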

3. References

  1. Abramov F.A., Feldman L.P., Svyatnyi V.A. Modeling of dynamic processes in mine aerology. Kiev, Naukova Dumka, 1981, 291 p. (in Russian)
  2. Svyatnyi V.A. Parallel simulation of complex dynamic systems // Modeling 2006: International Conference. Kiev, 2006. pp. 83–90. (in Ukrainian)
  3. Mikov A.I., Zamyatina E.B., Kozlov A.A. Optimization of parallel computations using multi-agent balancing // Proceedings of the PAVT-2009 conference, pp. 599-604, Nizhny Novgorod, Russia, 2009. (in Russian)
  4. Feldman L.P., Svyatnyi V.A., Resch M., Zeitz M. Research field: parallel simulation. / Scientific Works of Donetsk National Technical University. Series: Problems of Modeling and Design Automation [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/Npdntu/Pm/2008/08flpfps.pdf (in Russian)
  5. Nadeev D.V. Devirtualization of virtual parallel models of complex dynamic systems according to load balancing criteria. / Scientific Works of Donetsk National Technical University. Series: Problems of Modeling and Design Automation [Electronic resource]. Available at: http://www.nbuv.gov.ua/portal/natural/Npdntu_pm/2008/08ndvolb.pdf (in Russian)
  6. Svyatnyi V.A., Nadeev D.V. Resource load balancing subsystem of a distributed parallel simulation environment. / Scientific Works of Donetsk National Technical University. Series: Informatics, Cybernetics and Computer Engineering (IKVT-02), issue 39. Donetsk, DonNTU, 2002. pp. 264-270. (in Ukrainian)
  7. Golub S.V. Dynamic load balancing of processors in multiprocessor systems [Electronic resource] / DonNTU Masters' portal. Available at: http://masters.donntu.ru/2002/fvti/golub/magwork/index.html (in Russian)
  8. Xiao Qin, Hong Jiang, Adam Manzanares, Xiaojun Ruan, Shu Yin. Communication-Aware Load Balancing for Parallel Applications on Clusters [Electronic resource]. Available at: http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1051&context=csearticles&sei-redir=1
  9. Technical documentation on installing and using the TORQUE distributed resource manager for computing clusters [Electronic resource]. Available at: http://www.adaptivecomputing.com/resources/docs/torque/pdf/TORQUE_Administrator's_Guide.pdf
  10. Technical documentation on installing and using the Maui job scheduler for computing clusters [Electronic resource]. Available at: http://www.adaptivecomputing.com/resources/docs/maui/pdf/mauiadmin.pdf
  11. Kornilina M.A., Yakobovskiy M.V. Dynamic load balancing of processors in the simulation of combustion problems // Proceedings of the conference "High-Performance Computing and Its Applications", October 30 - November 2, 2000, Chernogolovka. (in Russian)
  12. Yakobovskiy M.V. Processor load balancing [Electronic resource]. Available at: http://www.software.unn.ac.ru/ccam/files/ipa_f_2_01.pdf (in Russian)
  13. Yakobovskiy M.V. Dynamic load balancing [Electronic resource]. Available at: http://lira.imamod.ru/lit/msu2009/MSU_5i.ppt (in Russian)