Design and Performance Analysis of Partial Computation Output Schemes for Accelerating Coded Machine Learning

Cited by: 1
Authors
Xu, Xinping [1 ,2 ]
Lin, Xiaojun [3 ]
Duan, Lingjie [4 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Berkeley Educ Alliance Res Singapore, Singapore 138602, Singapore
[3] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[4] Singapore Univ Technol & Design, Engn Syst & Design Pillar, Singapore 487372, Singapore
Funding
US National Science Foundation;
Keywords
Runtime; Task analysis; Codes; Machine learning; Encoding; Sparse matrices; Servers; Coded machine learning; maximum-distance-separable codes; partial computation outputs; performance bound analysis;
DOI
10.1109/TNSE.2022.3228322
CLC Classification Number
T [Industrial Technology];
Subject Classification Code
08;
Abstract
Coded machine learning is a technique that uses codes, such as (n, q)-maximum-distance-separable ((n, q)-MDS) codes, to reduce the negative effect of stragglers by requiring only q out of n workers to complete their computation. However, the MDS scheme is significantly inefficient: it wastes stragglers' unfinished computation and keeps faster workers idle. Accordingly, this paper proposes to fragment each worker's load into small pieces and to utilize all workers' partial computation outputs (PCO) to reduce the overall runtime. While the PCO scheme is easy to implement, its theoretical runtime analysis is challenging. We present new bounds and asymptotic analysis to prove that our PCO scheme always reduces the overall runtime for any random distribution of workers' speeds, and that its performance gain over the MDS scheme can be arbitrarily large under high variability of workers' speeds. Moreover, our analysis shows another advantage: the PCO scheme's performance is robust and insensitive to variations in system parameters, whereas the MDS scheme must know workers' speeds in order to carefully optimize q. Finally, we implement our PCO scheme to solve a typical machine learning problem, linear regression, and our realistic experiments validate that it reduces the overall runtime of the MDS scheme by at least 12.3%.
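As a rough illustration of the mechanism described in the abstract (not the authors' implementation), the sketch below encodes row blocks of a matrix A with a Vandermonde-based (n, q)-MDS code and contrasts two completion rules for computing A x: the plain MDS rule that waits for the q fastest workers' full coded outputs, and a PCO-style rule that fragments each worker's load into small pieces and decodes once enough piece results have arrived from all workers combined. The worker speeds, the piece count p, and the "any q*p piece results suffice" decoding condition are illustrative assumptions; the paper's exact scheme and analysis are not reproduced here.

```python
# A minimal sketch, assuming a Vandermonde-style (n, q)-MDS encoding of row blocks.
# The speeds, the piece count p, and the piece-level decoding rule are assumptions
# made for illustration only, not details taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, q = 6, 4                      # n workers; any q encoded blocks suffice to decode
m, d = 400, 20                   # A is m x d, split into q row blocks of m // q rows
A = rng.standard_normal((m, d))
x = rng.standard_normal(d)
blocks = np.split(A, q)          # A_1, ..., A_q

# (n, q)-MDS encoding via a Vandermonde matrix: worker i stores B_i = sum_j G[i, j] * A_j.
G = np.vander(np.arange(1, n + 1), q, increasing=True).astype(float)
encoded = [sum(G[i, j] * blocks[j] for j in range(q)) for i in range(n)]

# Sanity check of MDS decoding: recover A x from any q workers' full coded outputs.
S = [0, 1, 2, 3]                                       # indices of q responsive workers
coded = np.stack([encoded[i] @ x for i in S])          # shape (q, m // q)
assert np.allclose(np.concatenate(np.linalg.solve(G[S], coded)), A @ x)

# Assumed worker speeds in rows per unit time (the two slow workers are stragglers).
speeds = np.array([10.0, 9.0, 8.0, 7.0, 2.0, 1.0])
rows_per_worker = m // q

# MDS scheme: runtime is set by the q-th fastest worker finishing its entire block.
t_mds = np.sort(rows_per_worker / speeds)[q - 1]

# PCO-style scheme: each worker's load is fragmented into p pieces that are encoded
# at piece granularity, so (under this illustrative model) any q * p completed piece
# results across all workers allow decoding. Runtime is the (q * p)-th smallest
# piece completion time over the whole system.
p = 10
rows_per_piece = rows_per_worker // p
piece_times = np.sort(np.concatenate(
    [np.arange(1, p + 1) * rows_per_piece / s for s in speeds]))
t_pco = piece_times[q * p - 1]

print(f"MDS runtime: {t_mds:.2f},  PCO-style runtime: {t_pco:.2f}")
```

Under this toy model the PCO-style rule finishes earlier because both the stragglers' partial pieces and the fast workers' extra pieces count toward the decoding threshold, which is the inefficiency of the MDS scheme that the abstract highlights.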
Pages: 1119 - 1130
Page count: 12