Cluster Computing Review
Table of Cluster Management Software Packages Reviewed
Preface
Chapter 1
1. Cluster Computing Review
1.1 Summary of Conclusions
1.2 Introduction
1.3 Organisation of this Review Document
1.4 Comments about this Review
1.5 Cluster Software and Its Interaction With the Operating System
1.6 Some Words About Cluster Computing
1.7 The Workings of Typical Cluster Management Software
1.8 Clusters of Workstations: The Ownership Hurdle
Chapter 2
2. Evaluation Criteria
2.1 Introduction
2.2 Computing Environments Supported
2.2.1 COMMERCIAL/RESEARCH
2.2.2 HETEROGENEOUS
2.2.3 PLATFORMS
2.2.4 OPERATING SYSTEMS
2.2.5 ADDITIONAL HARDWARE/SOFTWARE
2.3 Application Support
2.3.1 BATCH JOBS
2.3.2 INTERACTIVE SUPPORT
2.3.3 PARALLEL SUPPORT
2.3.4 QUEUE TYPE
2.4 Job Scheduling and Allocation Policy
2.4.1 DISPATCHING POLICY
2.4.2 IMPACT ON WORKSTATION OWNER
2.4.3 IMPACT ON THE WORKSTATION
2.4.4 LOAD BALANCING
2.4.5 CHECK POINTING
2.4.6 PROCESS MIGRATION
2.4.7 JOB MONITORING AND RESCHEDULING
2.4.8 SUSPENSION/RESUMPTION OF JOBS
2.5 Configurability
2.5.1 RESOURCE ADMINISTRATION
2.5.2 JOB RUNTIME LIMITS
2.5.3 FORKED CHILD MANAGEMENT
2.5.4 PROCESS MANAGEMENT
2.5.5 JOB SCHEDULING CONTROL
2.5.6 GUI/COMMAND-LINE
2.5.7 EASE OF USE
2.5.8 USER ALLOCATION OF JOBS
2.5.9 USER JOB STATUS QUERY
2.5.10 JOB STATISTICS
2.6 Dynamics of Resources
2.6.1 RUNTIME CONFIGURATION
2.6.2 DYNAMIC RESOURCE POOL
2.6.3 SINGLE POINT OF FAILURE (SPF)
2.6.4 FAULT TOLERANCE
2.6.5 SECURITY ISSUES
Chapter 3
3. Cluster Management Software
3.1.1 INTRODUCTION
3.2 Commercial Packages
3.2.1 CODINE
3.2.2 CONNECT:QUEUE
3.2.3 LOAD BALANCER
3.2.4 LOADLEVELER
3.2.5 LOAD SHARING FACILITY (LSF)
3.2.6 NETWORK QUEUING ENVIRONMENT (NQE)
3.2.7 TASK BROKER
3.3 Research Packages
3.3.1 BATCH
3.3.2 COMPUTING CENTER SOFTWARE (CCS)
3.3.3 CONDOR
3.3.4 DISTRIBUTED JOB MANAGER (DJM)
3.3.5 DISTRIBUTED QUEUING SYSTEM (DQS 3.X)
3.3.6 EXTENSIBLE ARGONNE SCHEDULER SYSTEM (EASY)
3.3.7 FAR - A TOOL FOR EXPLOITING SPARE WORKSTATION CAPACITY
3.3.8 GENERIC NETWORK QUEUING SYSTEM (GNQS)
3.3.9 MULTIPLE DEVICE QUEUING SYSTEM (MDQS)
3.3.10 PORTABLE BATCH SYSTEM (PBS)
3.3.11 THE PROSPERO RESOURCE MANAGER (PRM)
3.3.12 QBATCH
Chapter 4
4. Assessment of Evaluation Criteria
4.1 Commercial Packages
4.1.1 CODINE
4.1.2 CONNECT:QUEUE
4.1.3 LOAD BALANCER
4.1.4 LOADLEVELER
4.1.5 LSF
4.1.6 NQE
4.1.7 TASK BROKER
4.2 Research Packages
4.2.1 BATCH
4.2.2 COMPUTING CENTER SOFTWARE
4.2.3 CONDOR
4.2.4 DJM
4.2.5 DQS 3.X - SUPERSEDES DQS 2.X AND DNQS
4.2.6 EASY
4.2.7 FAR
4.2.8 GENERIC NQS
4.2.9 MDQS
4.2.10 PORTABLE BATCH SYSTEM (PBS)
4.2.11 PRM
4.2.12 QBATCH
4.3 A Discussion About the CMS Packages Reviewed
4.3.1 INTRODUCTION
4.3.2 COMMERCIAL PACKAGES
4.3.3 RESEARCH PACKAGES
4.3.4 OVERALL CHOICE
Chapter 5
5. Cluster Computing Environments
5.1 Amoeba
5.2 Beowulf
5.3 The Oxford BSP Library
5.4 Distributed Computing Environment (DCE)
5.5 Dome
5.6 GLU Parallel Programming System
5.7 Local Area Multicomputer (LAM)
5.8 Networks of Workstations (NOW)
5.9 Parallel Virtual Machine (PVM)
5.10 The SHRIMP Project
5.11 The THESIS Distributed Operating System Project
5.12 WANE
Chapter 6
6. Conclusions
6.1 General Comments
6.2 Omissions in the CMS Packages
6.3 A Step-by-Step Guide to Choosing a CMS Package
6.4 Some Personal Views
6.5 The Future?
Chapter 7
7. Glossary of Terms
Chapter 8
8. References
Last updated 21st January 1996 by:
mab@npac.syr.edu