Design and Performance Analysis of Partial Computation Output Schemes for Accelerating Coded Machine Learning

Cited by: 1
Authors
Xu, Xinping [1 ,2 ]
Lin, Xiaojun [3 ]
Duan, Lingjie [4 ]
Affiliations
[1] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore 639798, Singapore
[2] Berkeley Educ Alliance Res Singapore, Singapore 138602, Singapore
[3] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[4] Singapore Univ Technol & Design, Engn Syst & Design Pillar, Singapore 487372, Singapore
Funding
US National Science Foundation;
Keywords
Runtime; Task analysis; Codes; Machine learning; Encoding; Sparse matrices; Servers; Coded machine learning; maximum-distance-separable codes; partial computation outputs; performance bound analysis;
DOI
10.1109/TNSE.2022.3228322
CLC Classification Number
T [Industrial Technology];
Subject Classification Code
08;
Abstract
Coded machine learning is a technique that uses codes, such as (n, q)-maximum-distance-separable ((n, q)-MDS) codes, to reduce the negative effect of stragglers by requiring only q out of n workers to complete their computation. However, the MDS scheme is significantly inefficient: it wastes stragglers' unfinished computation and keeps faster workers idle. Accordingly, this paper proposes to fragment each worker's load into small pieces and to utilize all workers' partial computation outputs (PCO) to reduce the overall runtime. While the PCO scheme is easy to implement, its theoretical runtime analysis is challenging. We present new bounds and asymptotic analysis to prove that our PCO scheme always reduces the overall runtime for any random distribution of workers' speeds, and that its performance gain over the MDS scheme can be arbitrarily large under high variability of workers' speeds. Moreover, our analysis shows another advantage: the PCO scheme's performance is robust and insensitive to variations in system parameters, whereas the MDS scheme must know workers' speeds in order to carefully optimize q. Finally, we implement our PCO scheme to solve a typical machine learning problem, linear regression, and our realistic experiments validate that it reduces the overall runtime of the MDS scheme by at least 12.3%.
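As a rough illustration of the mechanism described in the abstract (not the authors' implementation), the sketch below encodes row blocks of a matrix A with a Vandermonde-based (n, q)-MDS code and contrasts two completion rules for computing A x: the plain MDS rule that waits for the q fastest workers' full coded outputs, and a PCO-style rule that fragments each worker's load into small pieces and decodes once enough piece results have arrived from all workers combined. The worker speeds, the piece count p, and the "any q*p piece results suffice" decoding condition are illustrative assumptions; the paper's exact scheme and analysis are not reproduced here.

```python
# A minimal sketch, assuming a Vandermonde-style (n, q)-MDS encoding of row blocks.
# The speeds, the piece count p, and the piece-level decoding rule are assumptions
# made for illustration only, not details taken from the paper.
import numpy as np

rng = np.random.default_rng(0)
n, q = 6, 4                      # n workers; any q encoded blocks suffice to decode
m, d = 400, 20                   # A is m x d, split into q row blocks of m // q rows
A = rng.standard_normal((m, d))
x = rng.standard_normal(d)
blocks = np.split(A, q)          # A_1, ..., A_q

# (n, q)-MDS encoding via a Vandermonde matrix: worker i stores B_i = sum_j G[i, j] * A_j.
G = np.vander(np.arange(1, n + 1), q, increasing=True).astype(float)
encoded = [sum(G[i, j] * blocks[j] for j in range(q)) for i in range(n)]

# Sanity check of MDS decoding: recover A x from any q workers' full coded outputs.
S = [0, 1, 2, 3]                                       # indices of q responsive workers
coded = np.stack([encoded[i] @ x for i in S])          # shape (q, m // q)
assert np.allclose(np.concatenate(np.linalg.solve(G[S], coded)), A @ x)

# Assumed worker speeds in rows per unit time (the two slow workers are stragglers).
speeds = np.array([10.0, 9.0, 8.0, 7.0, 2.0, 1.0])
rows_per_worker = m // q

# MDS scheme: runtime is set by the q-th fastest worker finishing its entire block.
t_mds = np.sort(rows_per_worker / speeds)[q - 1]

# PCO-style scheme: each worker's load is fragmented into p pieces that are encoded
# at piece granularity, so (under this illustrative model) any q * p completed piece
# results across all workers allow decoding. Runtime is the (q * p)-th smallest
# piece completion time over the whole system.
p = 10
rows_per_piece = rows_per_worker // p
piece_times = np.sort(np.concatenate(
    [np.arange(1, p + 1) * rows_per_piece / s for s in speeds]))
t_pco = piece_times[q * p - 1]

print(f"MDS runtime: {t_mds:.2f},  PCO-style runtime: {t_pco:.2f}")
```

Under this toy model the PCO-style rule finishes earlier because both the stragglers' partial pieces and the fast workers' extra pieces count toward the decoding threshold, which is the inefficiency of the MDS scheme that the abstract highlights.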
Pages: 1119 - 1130
Page count: 12