New generalized data structures for matrices lead to a variety of high-performance algorithms

Authors
Gustavson, FG [1]
Affiliation
[1] IBM Corp, Thomas J Watson Res Ctr, Yorktown Heights, NY 10598 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We describe new data structures for full and packed storage of dense symmetric/triangular arrays that generalize both. Using the new data structures, one is led to several new algorithms that save "half" the storage and outperform the current block-based level-3 algorithms in LAPACK. We concentrate on the simplest forms of the new algorithms and show that for Cholesky factorization they are a direct generalization of LINPACK. This means that level-3 BLAS are not required to obtain level-3 performance. The replacements for level-3 BLAS are so-called kernel routines, and on IBM platforms they can be produced from simple textbook-type codes by the XLF Fortran compiler. In the sequel I will label these "vanilla" codes. For Cholesky on a Power3 with a peak performance of 800 MFlop/s, the performance at n ≥ 200 is over 720 MFlop/s and reaches 735 MFlop/s. Using conventional full-format LAPACK DPOTRF with ESSL BLAS, one first gets 600 MFlop/s at n ≥ 600 and only reaches a peak of 620 MFlop/s. We have also produced simple square blocked full-matrix data formats in which the blocks themselves are stored in column-major (Fortran) or row-major (C) order. The simple algorithm for LU factorization with partial pivoting for this new data format is a direct generalization of the LINPACK algorithm DGEFA. Again, no conventional level-3 BLAS are required; the replacements are again so-called kernel routines. Programming for the square blocked full-matrix format can be accomplished in standard Fortran through the use of three- and four-dimensional arrays. Thus, no new compiler support is necessary. Finally, we mention that other, more complicated algorithms are possible, for example recursive ones. The recursive algorithms are also easily programmed via tables that address where the blocks are stored in the two-dimensional recursive block array.
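To make the square blocked idea concrete, here is a minimal Python sketch of a Cholesky factorization over a lower-triangular grid of square blocks, driven by three small "vanilla" kernel routines instead of level-3 BLAS calls. It is an illustration under stated assumptions, not the paper's code: the function names (`to_lower_blocks`, `chol_kernel`, `trsm_kernel`, `update_kernel`, `blocked_cholesky`, `from_lower_blocks`) are hypothetical, nested Python lists stand in for the Fortran four-dimensional array, the matrix order is assumed to be a multiple of the block size, and the layout shown is a plain block-lower layout rather than the specific packed formats the paper defines.

```python
import math

def to_lower_blocks(M, b):
    """Copy symmetric M into b x b blocks; blocks[I][J] exists only for J <= I."""
    nb = len(M) // b  # assumption: len(M) is a multiple of b
    return [[[[float(M[I * b + i][J * b + j]) for j in range(b)]
              for i in range(b)]
             for J in range(I + 1)]
            for I in range(nb)]

def from_lower_blocks(blocks, b):
    """Expand the block grid back into a dense lower-triangular matrix."""
    n = len(blocks) * b
    L = [[0.0] * n for _ in range(n)]
    for I, row in enumerate(blocks):
        for J, blk in enumerate(row):
            for i in range(b):
                for j in range(b):
                    if I * b + i >= J * b + j:  # keep only the lower triangle
                        L[I * b + i][J * b + j] = blk[i][j]
    return L

def chol_kernel(D):
    """Textbook in-place Cholesky of one diagonal block (lower part only)."""
    b = len(D)
    for j in range(b):
        D[j][j] = math.sqrt(D[j][j] - sum(D[j][k] ** 2 for k in range(j)))
        for i in range(j + 1, b):
            D[i][j] = (D[i][j] - sum(D[i][k] * D[j][k] for k in range(j))) / D[j][j]

def trsm_kernel(B, Lkk):
    """Overwrite B with X, where X * Lkk^T = B and Lkk is lower triangular."""
    b = len(Lkk)
    for row in B:
        for j in range(b):
            row[j] = (row[j] - sum(row[k] * Lkk[j][k] for k in range(j))) / Lkk[j][j]

def update_kernel(C, A, B):
    """Block update C <- C - A * B^T."""
    b = len(C)
    for i in range(b):
        for j in range(b):
            C[i][j] -= sum(A[i][k] * B[j][k] for k in range(b))

def blocked_cholesky(blocks):
    """Right-looking Cholesky over the block grid, in place."""
    nb = len(blocks)
    for k in range(nb):
        chol_kernel(blocks[k][k])                      # factor diagonal block
        for i in range(k + 1, nb):
            trsm_kernel(blocks[i][k], blocks[k][k])    # scale the block column
        for i in range(k + 1, nb):
            for j in range(k + 1, i + 1):              # update trailing lower blocks
                update_kernel(blocks[i][j], blocks[i][k], blocks[j][k])
```

A caller would convert a symmetric positive definite matrix with `to_lower_blocks(M, b)`, run `blocked_cholesky` on the result, and read the factor back with `from_lower_blocks`. Only the blocks on or below the diagonal are ever allocated, which is where the approximate halving of storage comes from in this simplified layout.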
Pages: 46-61
Number of pages: 16