Abstract

Content

Introduction
1. Aspects of distributed systems

1.1 General information about load balancing
1.2 Building distributed application steps

2. Balancing methods

2.1 Static balancing
2.2 Dynamic balancing

3. Dynamic balancing and load migration

                3.1 Description of RCL–strategy load migration
                        3.1.1 First level actions
                        3.1.2 Second level actions
                        3.1.3 Random algorithm
                        3.1.4 Communication algorytm
                        3.1.5 Load computation algorithm
                3.2 RCL–algorithm embedding

Conclusion
Refernces

Introduction

Use of distributed computations for solution of complex tasks, secures essential reduction of time expenses. One of critical tasks assigned at creation of parallel computation system, is the balance of computers’ work–loads.

1. Aspects of distributed systems

1.1 General information about load balancing

Load Balancing is applied with a view to optimize parallel computations by means of parallel computing system, it offers uniform load distribution across processors of multi–processor computer system. Once new tasks emerge, the software designated for balancing, takes decision as to which processor (computational node) would perform computation related to the new task. Load balancing also involves partial transfer of computations from most–loaded processors to less–loaded ones [1].

1.2 Building distributed application steps

One of the steps in the process of a parallel program creation, is partitioning. It is meant for splitting an application into tasks to be performed at individual processors. In consequence of this partitioning, a set of tasks is created to be solved in parallel (simultaneously). These tasks can be both independent or inter–related by means of data exchange. Tasks distribution occurs in a separate step that allows for modules distribution obtained at the partitioning step, among processors.

An example of parallel system

Image 1 — An example of parallel system

Distributed application features collection of logic processes that intercommunicate and exchange messages. Logic processes are distributed over various computational nodes and operate in parallel. The logic processes are distributed over computational nodes to the effect that to assure uniform load of computational nodes.

However, at executing a parallel application, a conflict can arise between balanced distribution of objects across processors, and low rates of data exchange between the said processors. Some processors can go idle, while others can be over–loaded, if the intercommunication between them is effected with low rates [3]. On the other hand, expenses for communication can be substantial for a balanced system. That is why the balancing technique must be selected in such a manner to secure uniform loading of computational nodes and optimum data exchange rate between processors.

The implementation of this parallel computing system requires development of algorithms for objects synchronization that function at different nodes of a computational system. And conversely, the efficiency of synchronization algorithms implementation, is governed by the load balancing across computing system nodes.

Processes paralleling on processors

Image 2 — Processes paralleling on processors

Load imbalance can occur due to the reasons as follows:

discontinuity of parallel application structure, i.e., different logic processes require different computational facilities;
non–uniformity of computing system structure, i.e., different computational nodes are of different performance;
discontinuity of intercommunication structure between nodes of a computer network, i.e., communication lines between nodes can possess different characteristics of throughput capacity [4].

As of today, the universal technique to fight load imbalance, is not available.

2. Balancing methods

Balancing techniques are categorized as static and dynamic.

2.1 Static balancing

Static balancing is performed before executing a parallel application. Under distribution of logic processes across processors, the experience of earlier accomplishments (so–called study on historical data), is employed. Though assigning logic processes to computer network nodes, not necessarily brings beneficial effect.

It can be explained by the following:

Computational environment, where the application is executed, can change, viz., one or more computational nodes can fail.
Computational mode, where the parallel application is executed, can be busy with other computations.

2.2 Dynamic balancing

Dynamic balancing foresees load re–distribution across computational nodes during the application run time. For this balancing, software is used to determine:

processors load;
throughput capacity of communication lines between nodes;
frequency of data exchange between logic processes of parallel application, etc.

Dynamic techniques are normally employed, provided the time required for balancing, is far lower than the time for task performance [2]. The dynamic balancing task includes, as a rule, not only load distribution across computational nodes, but also, considering specifics of the computational algorithm, the selection of the optimum number of processors. Load balancing can be exercised by software and hardware, either centralized or de–centralized.

3 Dynamic balancing and load migration

Algorithm of dynamic balancing, determines from which computational node and to which one the task should be transferred during the system run time. This approach enables responds to changes in the computer system status. Nevertheless, dynamic balancing entails further time expenses for acquisition of statistic data on the computational environment status, on data analysis, and decisions taking.

Load transfer is the mechanism, which is employed for task transfer from one processor to another. Balancing and load transfer are used to improve efficiency of the parallel system [8]. Due to non–uniformity of the computational environment, one algorithm can display good performances in one parallel system and bad performances in another.

3.1 Description of RCL–strategy load migration

Below you will find three algorithms for dynamic load transfer:

random algorithm (random, R);
algorithm based on communication (communication, С);
algorithm based on load computation (load, L).

These balancing algorithms will be referred to as RCL–strategy hereafter.

Algorithm of object or process transfer from a computational node, is quite complicated. However, by introduction of certain assumptions, the complexity of this algorithm can be reduced. Assume that:

Distributed system combines several different data types (heterogeneous).
User can interact with a computer at any time on–line.
Number of computational nodes in the network is limited.
Alteration of the computer network is possible.
Delays in data exchange between nodes, are constant
Expenses for load transfer from one processor to another, are proportionate to physical dimensions of this load.

RCL strategy utilizes two–level process of decision taking (centralized and de–centralized approaches), which involves:

Central processor–coordinator takes decision on load transfer,
Each process that sends messages, selects load to be transferred [10].

3.1.1 First level actions

A central coordinator is selected to be capable of decision taking. At the beginning of load transfer activities, all working nodes suspend their work for a while, and each obtains information on the local load in the current time.

The coordinator analyses these data and based on them, determines whether the load transfer from one node to another, is essential. If the load transfer is requisite, the central coordinator performs activity on load re–distribution subject to the nodes operation rate.

Should the load transfer be not relevant, the processor sends messages thereof to other nodes, and every node continues processing its tasks.

3.1.2 Second level actions

If, while on the second level, sending data nodes will receive the signal from central coordinator on the need to migrate the load, they begin the selection process of the load to transfer it to another processor. It uses the RCL strategy. After that begins the transfer process. It includes a package of data and sending them to the sending node and receiving and unpacking on the receiver. At the end of each step of the migration receiver sends a message to all other processors on the transfer procedure and the new location data. Then, each processor continues to solve tasks [6].

Transferring the load involves the migration of objects from one node to another. To select the objects to be transferred to another computer, usually use one of the three algorithms.

3.1.3 Random algorithm

The random algorithm is a random selection of simulated objects of the sending node. Selection continues as long as the number of selected objects will not match the expected number.

The advantages of the random algorithm are: ease of implementation and relatively little time to select items to transfer.

3.1.4 Communication algorithm

The sending node selects the local objects that exchange information with the host most often. This method helps reduce the time to exchange data between two logical processes. However, in this case, each node must have a table of communication in which exchange frequency is noted for each object at the sending node to other nodes, and this table should be updated regularly [5]. Also, to select an object sorting and searching algorithms must be used, but with a large number of objects the time of linear searching and sorting algorithms being run can be significant.

3.1.5 Load computation algorithm

This algorithm is used in order to minimize the number of selected objects to move. During migration, objects are sorted by the simulation parameters of the load (number of events of each type, multiplied by a factor of complexity). First the object with the maximum load is being selected [9].

In comparison with the algorithm based on communication, this algorithm is less time–consuming and, therefore, is preferable.

3.2 RCL–algorithm embedding

The strategy of dynamic load transfer RCL has been implemented in the SPEEDES (Synchronous Parallel Environment for Emulation and Discrete–Event Simulation) in order to improve its performance.

Each algorithm was tested on three sets of input data. The external load was changing. The experimental results showed that the migration can yield good results under different conditions modeling — under various loads [7].

Image 3 — An example of load migration between processors
(animation: 7 cadres, 3 repeat loops, 200 Kb)
(the coordinator is colored in blue, processors that solving tasks — in orange, vacant processors — in green, overloaded processors — in red)

Conclusion

In conclusion we can say that for all three strategies, execution time is reduced if the interval between the procedures for distribution of the load is small.

The frequency of the procedure affects the redistribution of the simulation time significantly.

When comparing the parallel execution of the load transfer and without it appears that the transfer procedure significantly influences the duration of the experiment.

At the time this essay was written master's work was not yet completed. Final completion: December 2012. The full text of the work and materials on the topic can be obtained from the author or scientific researcher after that date.

References

Замятина Е. Б., Ефимов А. Ю., Козлов А. А., Усынин А. С. Оптимизация времени выполнения параллельных вычислений — Пермь, 2009.
Немнюгин С.А., Средства программирования для многопроцессорных вычислительных систем — Санкт–Петербург, 2007.
Миков А.И., Замятина Е.Б., Козлов А.А. Оптимизация параллельных вычислений с применением мультиагентной балансировки — Пермь, 2007.
Load Balancing in Parallel Computers. Электронный ресурс. Режим доступа: http://www.inspirenignite.com/load-balancing-in-parallel-computers/
Бельков Д.В., Алгоритмы балансировки загрузки процессоров параллельной вычислительной системы. Электронный ресурс. Режим доступа: http://ea.donntu.ru:8080/jspui/bitstream/123456789/6282/1/belkov.pdf
Афраймович Л.Г. Модели и методы эффективного использования распределенных вычислительных систем — Нижний Новгород, 2012
Ardhendu Mandal and Subhas Chandra Pal, An Empirical Study and Analysis of the Dynamic Load Balancing Techniques Used in Parallel Computing Systems. Электронный ресурс. Режим доступа http://arxiv.org/ftp/arxiv/papers/1109/1109.1650.pdf
Marc H. Willebeek–LeMair, Anthony P. Reeves, Transactions on parallel and distributed systems, vol.4, # 9 — 1993, c. 979
Балансировка нагрузки в распределенных системах. Электронный ресурс. Режим доступа http://intuit4.intuit.ru/studies/courses/1146/238/lecture/3287?page=1
Принципы построения параллельных вычислительных систем. Электронный ресурс. Режим доступа http://wiki.auditory.ru/Принципы_построения_параллельных_вычислительных_систем

Galyna Volokhova

Faculty of computer science and technologies

Department of applied math and informatics

Specialty Software engineering

Balancing load in parallel computing systems

Research adviser: PhD of Technical Sciences, Tetyana Mikhaylova