VECTORIZATION OF A THREAD-PARALLEL JACOBI SINGULAR VALUE DECOMPOSITION METHOD

Cited by: 2
Author
Novakovic, Vedran
Affiliation
[1] Zagreb
Source
SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2023 / Vol. 45 / Issue 03
Keywords
batched eigendecomposition of Hermitian matrices of order two; SIMD vectorization; singular value decomposition; parallel one-sided Jacobi-type SVD method; SVD ALGORITHM; ORTHOGONAL EIGENVECTORS; ACCURATE; QR
DOI
10.1137/22M1478847
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The eigenvalue decomposition (EVD) of (a batch of) Hermitian matrices of order two has a role in many numerical algorithms, of which the one-sided Jacobi method for the singular value decomposition (SVD) is the prime example. In this paper the batched EVD is vectorized with a vector-friendly data layout and the AVX-512 SIMD instructions of Intel CPUs, alongside other key components of a real and a complex OpenMP-parallel Jacobi-type SVD method, inspired by the sequential xGESVJ routines from LAPACK. These vectorized building blocks should be portable to other platforms that support similar vector operations. Unconditional numerical reproducibility is guaranteed for the batched EVD, sequential or threaded, and for the column transformations, which are, like the scaled dot-products, presently sequential but can be threaded if nested parallelism is desired. No avoidable overflow of the results can occur with the proposed EVD or the whole SVD. The measured accuracy of the proposed EVD often surpasses that of the xLAEV2 routines from LAPACK. While the batched EVD outperforms the matching sequence of xLAEV2 calls, speedup of the parallel SVD is modest but can be improved and is already beneficial with enough threads. Regardless of their number, the proposed SVD method gives identical results but of a somewhat lower accuracy than xGESVJ.
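As a rough illustration of what a batched EVD of order-two matrices looks like, the sketch below handles only the real symmetric case in plain Python, using the textbook Jacobi rotation formulas. It is not the paper's AVX-512 implementation and does not reproduce its overflow handling or accuracy properties. The function name `batched_sym2_evd` and the choice of three separate arrays for the matrix entries (a struct-of-arrays layout that a SIMD version could load lane by lane) are assumptions made for this example.

```python
import math

def batched_sym2_evd(a11, a12, a22):
    """Diagonalize a batch of real symmetric 2x2 matrices [[a11, a12], [a12, a22]].

    Hypothetical helper: the inputs are three equal-length sequences, one per
    matrix entry. For each matrix i it returns (cs, sn, lam1, lam2) such that
    the columns (cs, -sn) and (sn, cs) are eigenvectors for lam1 and lam2.
    """
    n = len(a11)
    cs = [1.0] * n
    sn = [0.0] * n
    lam1 = [float(x) for x in a11]
    lam2 = [float(x) for x in a22]
    for i in range(n):
        b = a12[i]
        if b != 0.0:
            # Textbook Jacobi step: t is the smaller-magnitude root of
            # t^2 + 2*tau*t - 1 = 0, i.e. the tangent of the rotation angle.
            tau = (a22[i] - a11[i]) / (2.0 * b)
            t = math.copysign(1.0, tau) / (abs(tau) + math.hypot(1.0, tau))
            cs[i] = 1.0 / math.hypot(1.0, t)
            sn[i] = t * cs[i]
            lam1[i] = a11[i] - t * b
            lam2[i] = a22[i] + t * b
    return cs, sn, lam1, lam2

# Example batch of three matrices, stored entry-wise.
cs, sn, lam1, lam2 = batched_sym2_evd([2.0, 4.0, 1.0], [1.0, 0.0, -2.0], [2.0, 3.0, 1.0])
```

Each loop iteration is independent of the others, which is what makes this kind of computation a natural candidate for SIMD lanes or threads.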
Pages: C73 - C100
Number of pages: 28