Load Balancing in Parallel Computers

 Author: Urbanic J.

A parallel compute is a collection of processing elements that communicate and cooperate to solve large problems efficiently. Parallel computers vary in two fundamental architecture facets, (i) Single Instruction Multiple Data (SIMD) Vs Multiple Instruction Multiple Data (MIMD) and (ii) Shared memory Vs Distributed memory. A parallel computer with a logically shared memory system provides a single global address space to all processors, and hence a shared programming paradigm to the users. Such systems ae referred as distributed shared memory (DSM) machines.

 

Load balancing on DSM machines is a challenging task, even though the shared global address space may be used as a common pool for work-loads awaiting as in centralized memory systems. Accessing remote memory banks are very expensive, an appropriate distribution of work-loads across physically distributed memories helps reduce such costly remote access.

 

Creating parallel programs involves first decomposing the overall computation into tasks and then assigning the tasks to the processors, this step is also called as partitioning. The optimization objective for partitioning is to balance the work-load among processors and to minimize the inter process communication needs. The number of processes generated by the partitioning step may not be equal to the processors, thus a processor may be idle or loaded with multiple processes. The primary optimization objective of mapping is to balance the workload of processors and to minimize the inter-processor communication cost. Collectively, the problem of load balancing is to develop partitioning and mapping algorithm for the purpose of achieving their respective optimization objectives.

 

Load balancing algorithms can be broadly categorized as static or dynamic. Static load balancing algorithms distribute the processes to processors at compile time, while dynamic algorithms bind processes to processors at run time. Static load balancing algorithms rely on the estimate execution times of the processes and inter-process communication requirement. It is not satisfactory for parallel programs that are of the dynamic and/or unpredictable kind. Consequently in dynamic load balancing, processes are generated and destroyed without a pattern at run time. A dynamic load balancing algorithm consists of four components, Load Measurement rule, an Information Exchange rule, an Initiation rule and a Load Balancing Operation.

 

Book: Scheduling and Load Balancing in Parallel and Distributed Systems, Editors, Behrooz A. Shirazi, Krishna M. Kavi and Ali R. Hurson