Compiler Transformation to Generate Hybrid Sparse Computations

Cited by: 0

Authors:

Zhang, Huihui [1]

Venkat, Anand [2]

Hall, Mary [1]

Affiliations:

[1] Univ Utah, Salt Lake City, UT 84112 USA

[2] Intel Corp, Parallel Comp Lab, Santa Clara, CA USA

Source:

PROCEEDINGS OF 2016 6TH WORKSHOP ON IRREGULAR APPLICATIONS: ARCHITECTURE AND ALGORITHMS (IA3) | 2016

Keywords:

MATRIX-VECTOR MULTIPLICATION; PERFORMANCE

DOI:

10.1109/IA3.2016.11

Chinese Library Classification (CLC):

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Applications over sparse matrices and graphs often rely on efficient matrix representations that exploit the nonzero structure of the sparse representation. In some cases, this structure varies within the matrix, e.g., some portions are more dense and others are very sparse. For such matrices, hybrid algorithms are commonly used in sparse linear algebra and graph libraries, which employ multiple representations and computations. Automating such an approach in a compiler is difficult as it depends on analysis of the input matrix, which is only available at runtime. This paper describes compiler and runtime support for generating hybrid implementations. It automatically partitions the input matrix or graph into multiple disjoint subsets, which correspond to significant differences of nonzero structures. These subsets can then be optimized separately. For this purpose, the paper introduces a non-affine split transformation, which automatically generates an inspector and multiple executors. The inspector analyzes and partitions the input matrix according to the split criteria. The resulting executors are further optimized with customized transformations to derive specialized representations. We demonstrate the performance gains on an Nvidia K20c (Kepler) GPU of hybrid implementations for examples from sparse linear algebra and graph analytics: sparse matrix-vector multiplication and stochastic gradient descent.
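The inspector/executor split described in the abstract can be illustrated with a minimal sketch. This is not the code the paper's compiler generates: the row-length threshold used as the split criterion, the padded ELL-style layout for the denser rows, the CSR layout for the remaining rows, and all function names (`inspector`, `build_ell`, `executor_ell`, `executor_csr`, `hybrid_spmv`) are assumptions chosen for illustration. It runs sequentially on the CPU rather than on a GPU.

```python
from typing import List, Tuple


def inspector(row_ptr: List[int], threshold: int) -> Tuple[List[int], List[int]]:
    """Partition row indices of a CSR matrix into two disjoint subsets.

    Assumed split criterion: rows with at least `threshold` nonzeros are
    treated as "dense"; all other rows are treated as "sparse".
    """
    dense_rows, sparse_rows = [], []
    for i in range(len(row_ptr) - 1):
        nnz = row_ptr[i + 1] - row_ptr[i]
        (dense_rows if nnz >= threshold else sparse_rows).append(i)
    return dense_rows, sparse_rows


def build_ell(rows, row_ptr, col_idx, vals):
    """Copy the selected rows into a padded, fixed-width (ELL-style) layout."""
    width = max((row_ptr[i + 1] - row_ptr[i] for i in rows), default=0)
    ell_cols, ell_vals = [], []
    for i in rows:
        start, end = row_ptr[i], row_ptr[i + 1]
        pad = width - (end - start)
        # Padding entries use column 0 with value 0.0, so they add nothing.
        ell_cols.append(list(col_idx[start:end]) + [0] * pad)
        ell_vals.append(list(vals[start:end]) + [0.0] * pad)
    return ell_cols, ell_vals


def executor_ell(rows, ell_cols, ell_vals, x, y):
    """Executor for the dense subset: fixed-width inner loop over padded rows."""
    for r, i in enumerate(rows):
        acc = 0.0
        for c, v in zip(ell_cols[r], ell_vals[r]):
            acc += v * x[c]
        y[i] = acc


def executor_csr(rows, row_ptr, col_idx, vals, x, y):
    """Executor for the sparse subset: variable-length CSR inner loop."""
    for i in rows:
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc


def hybrid_spmv(row_ptr, col_idx, vals, x, threshold=2):
    """Run the inspector once at runtime, then one executor per subset."""
    dense_rows, sparse_rows = inspector(row_ptr, threshold)
    ell_cols, ell_vals = build_ell(dense_rows, row_ptr, col_idx, vals)
    y = [0.0] * (len(row_ptr) - 1)
    executor_ell(dense_rows, ell_cols, ell_vals, x, y)
    executor_csr(sparse_rows, row_ptr, col_idx, vals, x, y)
    return y


# 3x3 example matrix: [[1, 2, 0], [0, 0, 3], [4, 5, 6]] in CSR form.
row_ptr = [0, 2, 3, 6]
col_idx = [0, 1, 2, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = hybrid_spmv(row_ptr, col_idx, vals, [1.0, 1.0, 1.0], threshold=2)
```

Because the two row subsets are disjoint, each executor writes a distinct part of `y`, so the two loops could in principle be tuned or scheduled independently, which is the property the abstract relies on when it says the subsets "can then be optimized separately".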


Pages: 34-41

Page count: 8
