Optimizing Sparse Matrix—Matrix Multiplication for the GPU

STEVEN DALTON and LUKE OLSON, University of Illinois at Urbana-Champaign
NATHAN BELL, Google




Publisher: Association for Computing Machinery
Copyright: © 2015 ACM, Inc.
ISSN: 0098-3500
DOI: 10.1145/2699470

Abstract

Sparse matrix-matrix multiplication (SpGEMM) is a key operation in numerous areas from information to the physical sciences. Implementing SpGEMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpGEMM operation into three highly parallel phases: expansion, sorting, and contraction, and introduce a set of complementary bandwidth-saving performance optimizations. Our implementation is fully general, and our optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.

Categories and Subject Descriptors: G.4 [Mathematical Software]: Algorithm Design and Analysis
General Terms: Algorithms, Performance
Additional Key Words and Phrases: Parallel, sparse, GPU, matrix-matrix

ACM Reference Format: Steven Dalton, Luke Olson, and Nathan Bell. 2015. Optimizing sparse matrix-matrix multiplication for the GPU. ACM Trans. Math. Softw. 41, 4, Article 25 (October 2015), 20 pages. DOI: http://dx.doi.org/10.1145/2699470

1. INTRODUCTION

Operations on sparse data structures abound in all areas of information and physical science. In particular, the sparse
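The three phases named in the abstract can be illustrated with a minimal serial sketch. This is not the paper's GPU implementation (which parallelizes each phase and adds the bandwidth-saving optimizations the abstract mentions); it is only a plain-Python illustration of the expansion-sorting-contraction idea, with matrices held as lists of (row, col, value) triples:

```python
def spgemm_esc(A, B):
    """Multiply sparse matrices A and B, each given as a list of
    (row, col, value) triples, via expansion, sorting, and contraction.

    Illustrative serial sketch only; the paper's implementation runs
    each phase in parallel on the GPU.
    """
    # Index B by row so each nonzero of A can find its matching row of B.
    B_rows = {}
    for k, j, v in B:
        B_rows.setdefault(k, []).append((j, v))

    # Expansion: every nonzero A[i,k] pairs with every nonzero B[k,j],
    # producing an unmerged list of partial products (i, j, A[i,k]*B[k,j]).
    expanded = []
    for i, k, a in A:
        for j, b in B_rows.get(k, []):
            expanded.append((i, j, a * b))

    # Sorting: order partial products so duplicates (same i, j) are adjacent.
    expanded.sort(key=lambda t: (t[0], t[1]))

    # Contraction: sum adjacent duplicates into the final nonzeros of C.
    result = []
    for i, j, v in expanded:
        if result and result[-1][0] == i and result[-1][1] == j:
            result[-1][2] += v
        else:
            result.append([i, j, v])
    return [tuple(t) for t in result]
```

On the GPU, the sorting and contraction phases map naturally onto primitives such as key-value sort and segmented reduction, which is what makes this decomposition attractive for throughput-oriented processors.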

Journal

ACM Transactions on Mathematical Software (TOMS), Association for Computing Machinery

Published: Oct 26, 2015
