Hypergraph Partitioning Implementation for Parallelizing Matrix-Vector Multiplication Using CUDA GPU-Based Parallel Computing

Cited by: 1
Authors
Murni [1]
Bustamam, A. [2]
Ernastuti [1]
Handhika, T. [1]
Kerami, D. [2]
Affiliations
[1] Gunadarma Univ, Computat Math Study Ctr, Depok, Indonesia
[2] Univ Indonesia, Dept Math, Fac Math & Nat Sci FMIPA, Depok 16424, Indonesia
Keywords
graph partitioning; hypergraph partitioning; parallelization; matrix-vector; CUDA
DOI
10.1063/1.4991257
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Matrix-vector multiplication in real-world problems often involves large matrices of arbitrary size, so parallelization is needed to speed up a calculation that would otherwise take a long time. The graph partitioning techniques discussed in previous studies cannot be used to parallelize matrix-vector multiplication for matrices of arbitrary size, because they assume that the matrix is square and symmetric. Hypergraph partitioning techniques overcome this shortcoming. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (Compute Unified Device Architecture) is a parallel computing platform and programming model created by NVIDIA and implemented on its GPUs (graphics processing units).
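
To illustrate why a hypergraph model can handle a rectangular matrix, the following is a minimal CPU-side sketch in Python, not the paper's CUDA implementation. It assumes the standard column-net model (one vertex per row, one net per column) and the connectivity-1 cut metric; the abstract does not state which hypergraph model or metric the authors use, and all function names, the example matrix, and the two row partitions are hypothetical. In this model the connectivity-1 value is commonly read as the number of x-vector entries that must be shared between parts.

```python
from collections import defaultdict


def column_nets(entries):
    """Column-net model: one vertex per row, one net (hyperedge) per column.

    Net j contains every row i that has a nonzero a_ij, so the matrix
    does not need to be square or symmetric.
    """
    nets = defaultdict(set)
    for i, j, _ in entries:
        nets[j].add(i)
    return dict(nets)


def connectivity_minus_one(nets, part_of_row):
    """Sum over all nets of (number of parts the net touches - 1)."""
    total = 0
    for rows in nets.values():
        parts = {part_of_row[i] for i in rows}
        total += len(parts) - 1
    return total


def partitioned_matvec(entries, x, n_rows, part_of_row, n_parts):
    """Compute y = A x, with each part handling only the rows it owns."""
    y = [0.0] * n_rows
    for p in range(n_parts):  # each iteration stands in for one parallel worker
        for i, j, v in entries:
            if part_of_row[i] == p:
                y[i] += v * x[j]
    return y


# A 4x6 rectangular (hence non-symmetric) sparse matrix as (row, col, value).
entries = [
    (0, 0, 1.0), (0, 1, 2.0),
    (1, 1, 3.0), (1, 2, 4.0),
    (2, 3, 5.0), (2, 4, 6.0),
    (3, 4, 7.0), (3, 5, 8.0),
]
x = [1.0] * 6

nets = column_nets(entries)
good = {0: 0, 1: 0, 2: 1, 3: 1}  # rows that share a column stay together
bad = {0: 0, 1: 1, 2: 0, 3: 1}   # rows that share a column are split apart

cut_good = connectivity_minus_one(nets, good)
cut_bad = connectivity_minus_one(nets, bad)
y = partitioned_matvec(entries, x, 4, good, 2)
```

Both partitions give the same product y; they differ only in the cut value, which is the quantity a hypergraph partitioner would try to minimize before the rows are distributed across GPU threads or blocks.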
Pages: 6