Optimizing Sparse Matrix—Matrix Multiplication for the GPU

STEVEN DALTON and LUKE OLSON, University of Illinois at Urbana-Champaign
NATHAN BELL, Google




Publisher: Association for Computing Machinery
Copyright: © 2015 ACM, Inc.
ISSN: 0098-3500
DOI: 10.1145/2699470

Abstract

Sparse matrix-matrix multiplication (SpGEMM) is a key operation in numerous areas from information to the physical sciences. Implementing SpGEMM efficiently on throughput-oriented processors, such as the graphics processing unit (GPU), requires the programmer to expose substantial fine-grained parallelism while conserving the limited off-chip memory bandwidth. Balancing these concerns, we decompose the SpGEMM operation into three highly parallel phases: expansion, sorting, and contraction, and introduce a set of complementary bandwidth-saving performance optimizations. Our implementation is fully general, and our optimization strategy adaptively processes the SpGEMM workload row-wise to substantially improve performance by decreasing the work complexity and utilizing the memory hierarchy more effectively.

Categories and Subject Descriptors: G.4 [Mathematical Software]: Algorithm Design and Analysis
General Terms: Algorithms, Performance
Additional Key Words and Phrases: Parallel, sparse, GPU, matrix-matrix

ACM Reference Format: Steven Dalton, Luke Olson, and Nathan Bell. 2015. Optimizing sparse matrix-matrix multiplication for the GPU. ACM Trans. Math. Softw. 41, 4, Article 25 (October 2015), 20 pages. DOI: http://dx.doi.org/10.1145/2699470

1. INTRODUCTION

Operations on sparse data structures abound in all areas of information and physical science. In particular, the sparse
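The three phases named in the abstract can be illustrated with a minimal serial sketch. This is not the paper's GPU implementation (which parallelizes each phase and adds the bandwidth-saving optimizations the abstract mentions); it is only a plain-Python illustration of the expansion-sorting-contraction idea, with matrices held as lists of (row, col, value) triples:

```python
def spgemm_esc(A, B):
    """Multiply sparse matrices A and B, each given as a list of
    (row, col, value) triples, via expansion, sorting, and contraction.

    Illustrative serial sketch only; the paper's implementation runs
    each phase in parallel on the GPU.
    """
    # Index B by row so each nonzero of A can find its matching row of B.
    B_rows = {}
    for k, j, v in B:
        B_rows.setdefault(k, []).append((j, v))

    # Expansion: every nonzero A[i,k] pairs with every nonzero B[k,j],
    # producing an unmerged list of partial products (i, j, A[i,k]*B[k,j]).
    expanded = []
    for i, k, a in A:
        for j, b in B_rows.get(k, []):
            expanded.append((i, j, a * b))

    # Sorting: order partial products so duplicates (same i, j) are adjacent.
    expanded.sort(key=lambda t: (t[0], t[1]))

    # Contraction: sum adjacent duplicates into the final nonzeros of C.
    result = []
    for i, j, v in expanded:
        if result and result[-1][0] == i and result[-1][1] == j:
            result[-1][2] += v
        else:
            result.append([i, j, v])
    return [tuple(t) for t in result]
```

On the GPU, the sorting and contraction phases map naturally onto primitives such as key-value sort and segmented reduction, which is what makes this decomposition attractive for throughput-oriented processors.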

Journal

ACM Transactions on Mathematical Software (TOMS), Association for Computing Machinery

Published: Oct 26, 2015
