Exploiting Matrix Dependency for Efficient Distributed Matrix Computation

Cited by: 19
Authors
Yu, Lele [1]
Shao, Yingxia [1]
Cui, Bin [1]
Affiliations
[1] Peking Univ, Sch EECS, Key Lab High Confidence Software Technol MOE, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
matrix computing; dependency analysis; distributed system;
DOI
10.1145/2723372.2723712
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Distributed matrix computation is a popular approach for many large-scale data analysis and machine learning tasks. However, existing distributed matrix computation systems generally incur heavy communication cost at runtime, which degrades the overall performance. In this paper, we propose a novel matrix computation system, named DMac, which exploits the matrix dependencies in matrix programs for efficient matrix computation in a distributed environment. We decompose each matrix program into a sequence of operations and reveal the matrix dependencies between operations in the program. We then design a dependency-oriented cost model to select an optimal execution strategy for each operation and generate a communication-efficient execution plan for the matrix computation program. To facilitate matrix computation in distributed systems, we further divide the execution plan into multiple un-interleaved stages, which can run in a distributed cluster with an efficient local execution strategy on each worker. The DMac system has been implemented on a popular general-purpose data processing framework, Spark. The experimental results demonstrate that our techniques can significantly improve the performance of a wide range of matrix programs.
Pages: 93-105
Number of pages: 13
Related papers
50 in total
  • [1] Efficient Large Scale Distributed Matrix Computation with Spark
    Gu, Rong
    Tang, Yun
    Wang, Zhaokang
    Wang, Shuai
    Yin, Xusen
    Yuan, Chunfeng
    Huang, Yihua
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIG DATA, 2015, : 2327 - 2336
  • [2] Securely Straggler-Exploiting Coded Computation for Distributed Matrix Multiplication
    Yang, Heecheol
    Hong, Sangwoo
    Lee, Jungwoo
    [J]. IEEE ACCESS, 2021, 9 : 167374 - 167388
  • [3] Squeezed Polynomial Codes: Communication-Efficient Coded Computation in Straggler-Exploiting Distributed Matrix Multiplication
    Hong, Sangwoo
    Yang, Heecheol
    Lee, Jungwoo
    [J]. IEEE ACCESS, 2020, 8 : 190516 - 190528
  • [4] Private and Secure Coded Computation in Straggler-Exploiting Distributed Matrix Multiplication
    Yang, Heecheol
    Hong, Sangwoo
    Lee, Jungwoo
    [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2021, : 2137 - 2142
  • [5] Efficient computation of the matrix cosine
    Sastre, Jorge
    Ibanez, Javier
    Ruiz, Pedro
    Defez, Emilio
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2013, 219 (14) : 7575 - 7585
  • [6] Efficient Computation of Matrix Chain
    Wang, Xiaodong
    Zhu, Daxin
    Tian, Jun
    [J]. PROCEEDINGS OF THE 2013 8TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION (ICCSE 2013), 2013, : 703 - 707
  • [7] Redundancy Elimination in Distributed Matrix Computation
    Chen, Zihao
    Han, Baokun
    Xu, Chen
    Qian, Weining
    Zhou, Aoying
    [J]. PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22), 2022, : 573 - 586
  • [8] Exploiting a matrix identity in the computation of the efficient score test for overdispersion in the Poisson regression model
    Moffatt, PG
    [J]. STATISTICS & PROBABILITY LETTERS, 1997, 32 (01) : 75 - 79
  • [9] Efficient Algorithm for the Computation of the Solution to a Sparse Matrix Equation in Distributed Control Theory
    Pedroso, Leonardo
    Batista, Pedro
    [J]. MATHEMATICS, 2021, 9 (13)
  • [10] Efficient computation of the latent vectors of a matrix
    Samuelson, PA
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1943, 29 : 393 - 397