Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures

Cited by: 0
Authors
Benson, Austin R. [1 ]
Gleich, David F. [2 ]
Demmel, James [3 ]
Affiliations
[1] Stanford Univ, Inst Computat & Math Engn, Stanford, CA 94305 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
[3] Univ Calif Berkeley, Div Comp Sci, Dept Math, Berkeley, CA 94720 USA
Keywords
matrix factorization; QR; SVD; TSQR; MapReduce; Hadoop
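DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812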
Abstract
The QR factorization and the SVD are two fundamental matrix decompositions with applications throughout scientific computing and data analysis. For matrices with many more rows than columns, so-called "tall-and-skinny matrices," there is a numerically stable, efficient, communication-avoiding algorithm for computing the QR factorization. It has been used in traditional high performance computing and grid computing environments. For MapReduce environments, existing methods to compute the QR decomposition use a numerically unstable approach that relies on indirectly computing the Q factor. In the best case, these methods require only two passes over the data. In this paper, we describe how to compute a stable tall-and-skinny QR factorization on a MapReduce architecture in only slightly more than two passes over the data. We can compute the SVD with only a small change and no difference in performance. We present a performance comparison between our new direct TSQR method, indirect TSQR methods that use the communication-avoiding TSQR algorithm, and a standard unstable implementation for MapReduce (Cholesky QR). We find that our new stable method is competitive with unstable methods for matrices with a modest number of columns. This holds both in a theoretical performance model and in an actual implementation.
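The contrast the abstract draws between a direct TSQR and Cholesky QR can be illustrated with a small in-memory sketch. This is not the authors' MapReduce implementation: it assumes NumPy, a matrix that fits in memory, and row blocks that each have at least as many rows as columns. The function names `direct_tsqr`, `cholesky_qr`, and `orthogonality_error` are hypothetical and chosen only for illustration.

```python
import numpy as np


def direct_tsqr(A, num_blocks=4):
    """Two-level TSQR sketch: returns Q (m x n) and R (n x n) with A = Q @ R.

    Assumes every row block has at least n rows (illustrative restriction).
    """
    n = A.shape[1]
    blocks = np.array_split(A, num_blocks, axis=0)
    # Step 1 (map-like): independent reduced QR of each row block.
    local = [np.linalg.qr(block) for block in blocks]
    # Step 2 (reduce-like): QR of the stacked n x n local R factors.
    R_stack = np.vstack([R_i for _, R_i in local])
    Q2, R = np.linalg.qr(R_stack)
    # Step 3 (map-like): combine each local Q with its n x n slice of Q2.
    Q_parts = [Q_i @ Q2[i * n:(i + 1) * n, :] for i, (Q_i, _) in enumerate(local)]
    return np.vstack(Q_parts), R


def cholesky_qr(A):
    """Cholesky QR sketch: R from the Gram matrix A^T A, then Q = A R^{-1}."""
    G = A.T @ A
    R = np.linalg.cholesky(G).T        # upper-triangular factor, G = R^T R
    Q = np.linalg.solve(R.T, A.T).T    # solves R^T Q^T = A^T, i.e. Q = A R^{-1}
    return Q, R


def orthogonality_error(Q):
    """Spectral-norm distance of Q^T Q from the identity."""
    return np.linalg.norm(Q.T @ Q - np.eye(Q.shape[1]), 2)
```

Calling both functions on an ill-conditioned tall matrix and comparing `orthogonality_error` for the two Q factors is one way to observe the stability gap: forming the Gram matrix squares the condition number, whereas the block-wise QR steps work on the original data.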
Pages: 9
Related papers
28 in total
  • [21] AutoTSMM: An Auto-tuning Framework for Building High-Performance Tall-and-Skinny Matrix-Matrix Multiplication on CPUs
    Li, Chendi
    Jia, Haipeng
    Cao, Hang
    Yao, Jianyu
    Shi, Boqian
    Xiang, Chunyang
    Sun, Jinbo
    Lu, Pengqi
    Zhang, Yunquan
    19TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2021), 2021: 159-166
  • [22] Algorithm 782: Codes for rank-revealing QR factorizations of dense matrices
    Bischof, CH
    Quintana-Ortí, G
    ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1998, 24 (02): 254-257
  • [23] Exploring Dual-Triangular Structure for Efficient R-Initiated Tall-Skinny QR on GPGPU
    Cheng, Nai-Yun
    Chen, Ming-Syan
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2019, PT II, 2019, 11440: 578-589
  • [24] Sparse direct factorizations through unassembled hyper-matrices
    Bientinesi, Paolo
    Eijkhout, Victor
    Kim, Kyungjoo
    Kurtz, Jason
    van de Geijn, Robert
    COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2010, 199 (9-12): 430-438
  • [25] TC-GVF: Tensor Core GPU-Based Vector Fitting via Accelerated Tall-Skinny QR Solvers
    Kukutla, Vinay
    Achar, Ramachandra
    Lee, Wai-Kong
    IEEE TRANSACTIONS ON COMPONENTS PACKAGING AND MANUFACTURING TECHNOLOGY, 2025, 15 (01): 54-63
  • [26] CholeskyQR2: A Simple and Communication-Avoiding Algorithm for Computing a Tall-Skinny QR Factorization on a Large-Scale Parallel System
    Fukaya, Takeshi
    Nakatsukasa, Yuji
    Yanagisawa, Yuka
    Yamamoto, Yusaku
    2014 5TH WORKSHOP ON LATEST ADVANCES IN SCALABLE ALGORITHMS FOR LARGE-SCALE SYSTEMS (SCALA), 2014: 31-38
  • [27] On evaluating higher-order derivatives of the QR decomposition of tall matrices with full column rank in forward and reverse mode algorithmic differentiation
    Walter, Sebastian F.
    Lehmann, Lutz
    Lamour, Rene
    OPTIMIZATION METHODS & SOFTWARE, 2012, 27 (02): 391-403
  • [28] Scalable Direct-Iterative Hybrid Solver for Sparse Matrices on Multi-Core and Vector Architectures
    Ono, Kenji
    Kato, Toshihiro
    Ohshima, Satoshi
    Nanri, Takeshi
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING IN ASIA-PACIFIC REGION (HPC ASIA 2020), 2020: 11-21