Collins, David E.; George, Alan D.
doi: 10.1177/003754970107700503
The task of designing and optimizing job scheduling algorithms for heterogeneous computing environments requires the ability to predict scheduling performance. The complexity of heterogeneous scheduling issues requires that the experiments devised to test a scheduling paradigm be both flexible and extensive. Experimentally determined models for the prediction of job execution times on both sequential and parallel computing resources are combined with the implementation of a novel scheduling algorithm and a software-in-the-loop (SWIL) simulation. The result is a potent design and analysis approach for job scheduling algorithms and implementations intended for heterogeneous environments. This paper develops the concepts, mechanisms, and results of a SWIL design and analysis approach. The merits of this approach are shown in four case studies, which determine overhead, scheduling performance, and the impact of preemption and priority policies. These case studies illustrate the contributions of this research in the form of a new parallel job scheduling algorithm for heterogeneous computing and in the novel application of SWIL simulation to the analysis of job scheduling systems.
Teo, Yong Meng; Ayani, Rassul
doi: 10.1177/003754970107700504
This paper focuses on an experimental analysis of the performance and scalability of cluster-based web servers. We carry out the comparative studies using two experimental platforms, namely, a hardware testbed consisting of sixteen PCs, and a trace-driven discrete-event simulator. Dispatcher and web server service times used in the simulator are determined by carrying out a set of experiments on the testbed. The simulator is validated against stochastic queuing models and the testbed. Experiments on the testbed are limited by the hardware configuration, but our complementary approach allows us to carry out scalability studies on the validated simulator. The three dispatcher-based scheduling algorithms analyzed are: round robin scheduling, least connected based scheduling, and least loaded based scheduling. The least loaded algorithm is used as the baseline (upper performance bound) in our analysis, and the performance metrics include average waiting time, average response time, and average web server utilization. A synthetic trace generated by the workload generator called SURGE and a public-domain France Football World Cup 1998 trace are used. We observe that the round robin algorithm performs much worse than the other two algorithms for low to medium workload. However, as the request arrival rate increases, the performance of the three algorithms converges, with the least connected algorithm approaching the baseline algorithm at a much faster rate than round robin. The least connected algorithm performs well for medium to high workload. At very low load, its average waiting time is two to six times higher than that of the baseline algorithm, but the absolute difference between these two waiting times is very small.
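The three dispatcher policies named in this abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the `Dispatcher` class, its fields, and the use of outstanding service time as the "load" measure are assumptions made here for illustration only.

```python
class Dispatcher:
    """Toy dispatcher over n servers. All names are hypothetical."""

    def __init__(self, n_servers):
        self.connections = [0] * n_servers  # active connections per server
        self.load = [0.0] * n_servers       # outstanding work per server (assumed: seconds)
        self._next = 0                      # round robin pointer

    def round_robin(self):
        # Cycle through servers regardless of their current state.
        server = self._next
        self._next = (self._next + 1) % len(self.connections)
        return server

    def least_connected(self):
        # Pick the server with the fewest active connections (lowest index on ties).
        return min(range(len(self.connections)), key=self.connections.__getitem__)

    def least_loaded(self):
        # Pick the server with the least outstanding work (lowest index on ties).
        return min(range(len(self.load)), key=self.load.__getitem__)

    def assign(self, server, service_time):
        # Record a new request on the chosen server.
        self.connections[server] += 1
        self.load[server] += service_time

    def complete(self, server, service_time):
        # Record that a request on this server has finished.
        self.connections[server] -= 1
        self.load[server] -= service_time
```

The least loaded policy needs to know each request's service demand, which a real dispatcher usually cannot observe in advance; that is consistent with the abstract's use of it as an upper performance bound rather than as a practical policy.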
doi: 10.1177/003754970107700505
Scheduling is an important aspect of next generation wide area networks, as it plays a major role in determining the quality of service (QoS) a particular application receives. In this paper, we discuss the main design issues related to scheduling and provide a framework for the analysis of different scheduling algorithms to support the QoS requirements of multimedia applications. Detailed simulation experiments, using different network parameters, including switch buffer sizes, inter-arrival rates, service rates, and network link capacities, were conducted. Our results show that simple scheduling algorithms such as FCFS are not adequate to support real-time QoS when the network load is heavy. The results also show that more complex algorithms may be required to provide an acceptable level of service guarantees.
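Why FCFS can fall short for real-time traffic under load can be seen in a small sketch of a single non-preemptive link. The abstract does not say which more complex algorithms were studied; strict priority is used here only as an assumed stand-in, and the function name and packet tuple layout are hypothetical.

```python
import heapq


def mean_rt_delay(packets, use_priority):
    """Mean delay (waiting plus service) of real-time packets on one link.

    packets: list of (arrival, service_time, is_real_time), sorted by arrival.
    use_priority: False gives FCFS; True serves real-time packets first
    (strict priority, an assumption made for this illustration).
    """
    pending = []   # heap of (key, arrival, service_time, is_real_time)
    clock = 0.0
    i = 0
    delays = []
    n = len(packets)
    while i < n or pending:
        # If the link is idle, jump ahead to the next arrival.
        if not pending and packets[i][0] > clock:
            clock = packets[i][0]
        # Admit every packet that has arrived by now.
        while i < n and packets[i][0] <= clock:
            arrival, service, rt = packets[i]
            cls = 0 if (use_priority and rt) else 1
            heapq.heappush(pending, ((cls, i), arrival, service, rt))
            i += 1
        # Serve the head of the queue to completion (non-preemptive).
        _, arrival, service, rt = heapq.heappop(pending)
        clock += service
        if rt:
            delays.append(clock - arrival)
    return sum(delays) / len(delays)


# Two long best-effort packets arrive just ahead of one short real-time packet.
packets = [(0, 5, False), (1, 5, False), (2, 1, True)]
fcfs_delay = mean_rt_delay(packets, use_priority=False)
priority_delay = mean_rt_delay(packets, use_priority=True)
```

In this toy input the real-time packet waits behind both long packets under FCFS, but behind only the one already in service under strict priority.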
Al-Awwami, Z.H.; Obaidat, M.S.; Al-Mulhem, M.
doi: 10.1177/003754970107700506
This paper presents a novel deadlock recovery mechanism for fully adaptive routing in wormhole interconnection networks and its performance evaluation using simulation modeling. ZOMA is an efficient mechanism that takes advantage of the concept of wormhole switching in terms of low hardware resource requirements. The performance of the new mechanism can match that of other more expensive deadlock-recovery mechanisms, while requiring lower hardware resources that are not on the critical path of the switching process. The proposed mechanism creates a new category of deadlock recovery techniques that we refer to as preemptive, as opposed to the existing progressive and regressive categories. Performance evaluation of the proposed mechanism against other routing algorithms and deadlock-recovery techniques is accomplished using the object-oriented simulation approach. The simulator, WormSim, was written in Java for various reasons, the most important being modularity, hierarchy, extensibility, reusability, and flexibility.
doi: 10.1177/003754970107700507
We consider distributing soft real-time tasks on a cluster of multiple homogeneous servers. The question considered here is how to assign incoming soft real-time tasks to these servers for better performance, measured by the fraction of tasks that miss their deadlines. In this paper, two architectures are taken into account: centralized and distributed. Within the distributed architecture, four dispatching policies (round robin, Bernoulli splitting, joining the shortest queue, and chopping) are analyzed and evaluated under the same conditions. In the analysis, an approximate method is proposed and evaluated for the joining the shortest queue policy. The results show that for the distributed architecture, the joining the shortest queue policy performs the best. The previously proposed chopping policy has its limitations: when the workload exceeds a moderate level, it performs worse than round robin. In addition, we investigated the impact of using earliest deadline first to schedule tasks assigned to the same server, and found that it can further improve performance.
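The performance measure used in this abstract, the fraction of tasks that miss their deadlines, can be sketched for three of the four dispatching policies. This is a toy deterministic model, not the paper's analysis: each server is assumed to run its tasks FCFS, Bernoulli splitting is assumed to pick servers uniformly at random, and chopping is left out because the abstract does not define it. The function name and task tuple layout are hypothetical.

```python
import random


def miss_fraction(policy, tasks, n_servers, seed=0):
    """Fraction of tasks finishing after their deadline.

    tasks: list of (arrival, service_time, relative_deadline), sorted by arrival.
    policy: "round_robin", "bernoulli", or "jsq" (join the shortest queue).
    """
    rng = random.Random(seed)
    finish_times = [[] for _ in range(n_servers)]  # per-server completion times
    rr_next = 0
    missed = 0
    for arrival, service, rel_deadline in tasks:
        if policy == "round_robin":
            s = rr_next
            rr_next = (rr_next + 1) % n_servers
        elif policy == "bernoulli":
            # Assumed: equal splitting probabilities across servers.
            s = rng.randrange(n_servers)
        elif policy == "jsq":
            # Queue length = tasks on that server not yet finished at arrival.
            queue_len = [sum(1 for f in fs if f > arrival) for fs in finish_times]
            s = queue_len.index(min(queue_len))
        else:
            raise ValueError(f"unknown policy: {policy}")
        # FCFS on the chosen server: start when it becomes free.
        last_finish = finish_times[s][-1] if finish_times[s] else arrival
        finish = max(arrival, last_finish) + service
        finish_times[s].append(finish)
        if finish > arrival + rel_deadline:
            missed += 1
    return missed / len(tasks)


tasks = [(0, 4, 5), (0, 1, 5), (1, 1, 3), (2, 4, 5)]
rr_miss = miss_fraction("round_robin", tasks, n_servers=2)
jsq_miss = miss_fraction("jsq", tasks, n_servers=2)
```

In this hand-picked input, round robin sends the short third task behind a long one and it misses its deadline, while join the shortest queue routes it to the idle server. Replacing the per-server FCFS order with earliest deadline first would be the natural extension for the abstract's final observation.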
doi: 10.1177/003754970107700508
The aim of our research is to develop a distributed system platform that supports a variety of tasks. Currently, we are implementing Internet applications on the system, including a service redirector, firewall, and web applications. These applications have different levels of dependability requirements. Depending on its criticality, a single task may execute on one or more computer nodes. Software-implemented fault-tolerant protocols are used to detect disagreement among replicas. A reconfiguration protocol is used to identify faulty nodes according to the fault reports from other fault-tolerant protocols and to reallocate their tasks to other working nodes. As a part of the project, this work focuses on the implementation and simulation of the system and the firewall application. Data transfer throughput of the system is measured and analyzed. The results are being used to support our development of the overall system.
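The detect-then-reallocate flow described in this abstract can be sketched as follows. The abstract does not specify the protocols, so majority voting over replica outputs and a simple spare-node replacement rule are assumptions made here, and both function names are hypothetical.

```python
from collections import Counter


def find_disagreeing(outputs):
    """outputs: dict mapping node name -> hashable replica result.

    Returns (majority_value, nodes_that_differ). With no strict majority,
    returns (None, all nodes), since no replica can be trusted.
    """
    counts = Counter(outputs.values())
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None, set(outputs)
    return value, {node for node, result in outputs.items() if result != value}


def reallocate(assignment, faulty, spares):
    """Replace faulty nodes in each task's replica list with working spares.

    assignment: dict mapping task name -> list of node names.
    """
    usable = [n for n in spares if n not in faulty]
    new_assignment = {}
    for task, nodes in assignment.items():
        kept = [n for n in nodes if n not in faulty]
        candidates = [n for n in usable if n not in kept]
        needed = len(nodes) - len(kept)
        # If too few spares exist, the task runs with fewer replicas.
        new_assignment[task] = kept + candidates[:needed]
    return new_assignment


value, faulty = find_disagreeing({"a": 1, "b": 1, "c": 2})
plan = reallocate({"firewall": ["a", "b", "c"]}, faulty, spares=["d", "e"])
```

Here node `c` disagrees with the majority, so the firewall task keeps `a` and `b` and gains the first available spare.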