Parallel Block Matrix Factorizations on the Shared-Memory Multiprocessor IBM 3090 VF/600J


Sage Publications
Copyright © 1992 by SAGE Publications

Krister Dackland, Erik Elmroth, and Bo Kågström
Institute of Information Processing, University of Umeå, S-901 87 Umeå, Sweden

Charles Van Loan
Department of Computer Science, Cornell University, Ithaca, New York 14853-7501

Summary

Efficient parallel block algorithms for the LU factorization with partial pivoting, the Cholesky factorization, and the QR factorization, transportable over a range of parallel MIMD architectures, are presented. Parallel implementations of different block algorithms that utilize optimized uniprocessor level-3 BLAS are compared with the corresponding routines of LAPACK (under development). In LAPACK, parallelism is mainly invoked implicitly, by replacing calls to uniprocessor level-3 kernels with calls to parallel level-3 kernels, which maintains portability. However, by parallelizing explicitly at the block level, it is possible to overlap and pipeline different matrix-matrix operations and thereby gain some performance. Theoretical models give upper bounds on the best possible speedup of the explicitly and implicitly parallel block algorithms for the target machine.

The International Journal of Supercomputer Applications, Volume 6, No. 1, Spring 1992, pp. 69-97. © 1992 Massachusetts Institute of Technology.

Introduction

With the introduction of advanced parallel