On the performance of parallel factorization of out-of-core matrices

Cited by: 5
Authors
Caron, E [1]
Utard, G [1]
Affiliation
[1] Univ Lyon 1, INRIA Rhone Alpes, ENS Lyon, GRAAL Project, CNRS, UMR, LIP Lab, F-69364 Lyon 07, France
Keywords
matrix factorization; out-of-core; numerical library (ScaLAPACK)
DOI
10.1016/j.parco.2004.02.002
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we present an analytical performance model of the parallel left-right looking out-of-core LU factorization algorithm for cluster-like architectures. We show the accuracy of the performance prediction model for the ScaLAPACK library. We analyze the overhead introduced by the out-of-core part of the algorithm and identify a previously unreported limitation: for large problems, the algorithm has poor efficiency. This overhead is divided into an I/O part and a communication part. We derive an overlapping scheme and the minimum memory requirement needed to avoid the I/O overhead. The new scheme is validated by a prototype implementation in ScaLAPACK. We show the impact of the communication overhead on two-dimensional distributions. We then show that, with similar memory requirements, a second overlapping scheme may be implemented to avoid the communication overhead. If the size of the physical main memory is proportional to the matrix order (O(N) bytes), then the performance of the out-of-core algorithm is similar to that of the in-core algorithm, which requires O(N^2) bytes. This paper demonstrates that there is no memory limitation for the factorization of huge matrices. © 2004 Elsevier B.V. All rights reserved.
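As a rough illustration of the O(N) versus O(N^2) memory claim in the abstract, the sketch below compares the bytes needed to hold a full N x N matrix in core with the bytes needed to hold only a fixed number of full-height columns. The function names, the 8-byte (double-precision) word size, and the panel width of 64 columns are assumptions chosen for illustration; they are not values taken from the paper, and the sketch ignores how memory is divided among processes.

```python
def in_core_bytes(n: int, word_bytes: int = 8) -> int:
    """Bytes to hold a full n x n matrix in memory: grows as O(n^2)."""
    return word_bytes * n * n


def panel_bytes(n: int, panel_cols: int, word_bytes: int = 8) -> int:
    """Bytes to hold panel_cols full-height columns: grows as O(n).

    panel_cols is a hypothetical fixed panel width, not a figure from the paper.
    """
    return word_bytes * n * panel_cols


def memory_ratio(n: int, panel_cols: int) -> float:
    """How many times larger the in-core footprint is than the panel footprint."""
    return in_core_bytes(n) / panel_bytes(n, panel_cols)
```

Under these assumptions, a matrix of order N = 100,000 needs 80 GB in core, while 64 full columns need about 51 MB; the ratio N / 64 keeps growing with N, which is why a memory budget proportional to N scales to much larger problems.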
Pages: 357-375
Number of pages: 19