Clusters of computers have become an efficient platform for computationally intensive applications. Today, clusters are used mainly through batch schedulers or Single System Image (SSI) systems. In the former case, the scheduling of applications is managed by a supervising batch system according to the resources available in the cluster. In SSI, by contrast, application scheduling is handled transparently by the operating system, giving the appearance of a symmetric multiprocessor (SMP).
For some years, batch scheduling has been preferred because of its simplicity of use, configuration and implementation. Recent contributions in SSI systems have shown the abilities of such systems in different fields and directions. Among these contributions, load allocation and balancing, which is usually handled by an automatic process migration daemon, has performed well, especially in reducing application execution time.
The single system image architecture was developed to provide a unified system view and to globalize the processors, file system and network. The characteristics of SSI allow users to access system resources transparently, irrespective of where they are located (1). Load-balancing SSI clusters dominate research work in this area.
In this study, we briefly describe the types of SSI clusters and concentrate on the load-balancing type as the main aim. This focus leads us to illustrate the load balancing strategies and the architectures of implemented systems. We then examine two important and successful implemented systems from the points of view of architecture, design, behaviour and working mechanism. From that view, we offer new ideas, especially on how to investigate the weaknesses of these systems and develop them further.
The study is structured as follows. First, we clarify the SSI organizations and structures to explain the types of SSI and how each is structured. Then, we describe the load balancing and scheduling mechanisms in order to identify the main components of load balancing in SSI. In terms of these components, we describe the development of various systems to trace the evolution of SSI load balancing systems, together with current developments and research. Finally, we clarify the problems in the implemented systems that have to be solved and emphasized in future research, and we outline the future directions of SSI systems.
The SSI organizations and structures: In classical cluster systems such as Beowulf, a programmer has to write an explicitly parallel program using the Message Passing Interface (MPI) or the Parallel Virtual Machine (PVM). In contrast to high-performance Beowulf clusters, Single System Image (SSI) clusters free the end user from this task. According to (2), SSI is the property of a cluster system that hides the distributed and heterogeneous nature of the resources in the cluster and presents them to applications and users as a single resource. SSI can be classified in different ways depending on its abstraction layer (3). The available layers of an SSI cluster are:
* Hardware layer
* Middleware layer
* Application layer
* Operating system layer
Unlike the other types, in the operating system layer most of the mechanisms are transparent to the user; in other words, the user does not deal with the system or the complexity of its implementation. The real benefit of this approach is therefore its ease of use: a program can benefit from the system's resources and availability without modification to its source code. A full SSI is more readily achieved at the OS layer (4), through cooperation among the nodes' operating systems to present the same view of the system. In practice, however, it is difficult to combine all the characteristics of the OS layer, although there is some preliminary work towards this goal (5). These characteristics are cluster-wide system management, cluster-wide device management, a cluster file system, cluster-wide process management and cluster-wide load balancing (4). To achieve the main purpose of SSI, the load balancing feature becomes the most important one for reducing execution time and achieving high performance. Since the main features of OS-layer SSI are ease of use and transparency, dynamic load balancing becomes the main part of the implementation.
With dynamic load balancing, the distribution of the workload among the workstations can change at run time, using current or recent information about the nodes when making decisions (6). There are two predominant organizations in which dynamic load balancing algorithms are implemented: centralized and decentralized (7). In a centralized structure, a central node plays the major part in the process placement decisions of the cluster. In a decentralized structure, each node in the cluster runs the algorithm independently, so the process placement decision can be made by any node.
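The two organizations can be contrasted with a minimal sketch. It is an illustration only, not the algorithm of any system discussed here: the `Node` class, the `known_loads` field, the function names and the least-loaded placement rule are all assumptions introduced for this example, and load is simply counted as the number of running processes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node used only for this illustration."""
    name: str
    load: int = 0  # number of processes currently running on the node
    # This node's (possibly partial or stale) view of its peers' loads.
    known_loads: dict = field(default_factory=dict)


def place_centralized(nodes):
    """Centralized: one central node sees every load and picks the target."""
    target = min(nodes, key=lambda n: n.load)
    target.load += 1
    return target.name


def place_decentralized(origin, nodes_by_name):
    """Decentralized: the node where the process arrives decides by itself,
    using only the load information it currently holds."""
    view = dict(origin.known_loads)
    view[origin.name] = origin.load  # a node always knows its own load
    target_name = min(view, key=view.get)
    nodes_by_name[target_name].load += 1
    return target_name
```

In this sketch the centralized decision is always based on complete information but depends on a single node, whereas the decentralized decision can be taken anywhere in the cluster but is only as good as the local view held in `known_loads`.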
Although dynamic load balancing policies adjust well to fluctuating load, they still suffer from imbalances. This is because once the load balancing algorithm has assigned a task to an execution site, that assignment does not change for the lifetime of the task.
Pre-emptive load balancing is an improvement on the dynamic policy. The difference is that load placement and scheduling decisions are made continuously during run time, so the scheduling decision is repeated. In this way, a task may begin execution at its original site and, due to load fluctuation, be reassigned to another site. Such reassignment is accomplished by a process migration mechanism. It appears from (8) that the benefits of pre-emptive load balancing may lead to its extensive use in distributed systems.
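The repeated decision can be sketched as a loop that keeps re-examining the node loads and moves a running process whenever the imbalance is too large. This is a simplified illustration under stated assumptions: the function names, the `queues` mapping, the process-count load metric and the `threshold` parameter are hypothetical, and moving an id between lists stands in for a real process migration.

```python
def migrate_once(queues, threshold=1):
    """Move one running process from the busiest node to the idlest one.

    queues maps a node name to the list of process ids running there.
    Returns (pid, source, destination), or None if the loads are close enough.
    """
    if threshold < 1:
        # With a threshold of 0 a process could bounce between two nodes forever.
        raise ValueError("threshold must be at least 1")
    busiest = max(queues, key=lambda n: len(queues[n]))
    idlest = min(queues, key=lambda n: len(queues[n]))
    if len(queues[busiest]) - len(queues[idlest]) <= threshold:
        return None
    pid = queues[busiest].pop()  # stand-in for a real process migration
    queues[idlest].append(pid)
    return pid, busiest, idlest


def rebalance(queues, threshold=1):
    """Repeat the placement decision at run time until no migration is needed."""
    moves = []
    while (move := migrate_once(queues, threshold)) is not None:
        moves.append(move)
    return moves
```

A non-pre-emptive dynamic policy would correspond to never calling `rebalance` after the initial placement, so an imbalance that develops later would persist until the processes finish.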
Scheduling and load balancing mechanisms in SSI: In a cluster of workstations, the main component of a load-balancing SSI implementation is the scheduling mechanism. When a workload is applied to any node of the cluster, it can be executed efficiently only if the available resources are used efficiently. So that, there …