Scalable management of distributed resources is one of the major challenges when building large-scale clusters for high-performance computing. This task includes transparent fault tolerance, efficient deployment of resources and support for all the needs of parallel applications: parallel I/O, deterministic behavior and responsiveness. These challenges may seem daunting with commodity hardware and operating systems, since they were not designed to support a global, single management view of a large-scale system. In this paper we propose and demonstrate an abstract network interface in the cluster interconnect to facilitate the implementation of a simple yet powerful global operating system. This system, which can be thought of as a coarse-grain SIMD operating system, can allow commodity clusters to grow to thousands of nodes, while still retaining the usability and performance of the single-node workstation.
The Computer Journal – Oxford University Press
Published: May 25, 2006