SparCML: High-Performance Sparse Communication for Machine Learning

Cited by: 61
Authors
Renggli, Cedric [1]
Ashkboos, Saleh [2]
Aghagolzadeh, Mehdi [3]
Alistarh, Dan [2]
Hoefler, Torsten [1]
Affiliations
[1] Swiss Federal Institute of Technology, Zurich, Switzerland
[2] IST Austria, Vienna, Austria
[3] Microsoft, Redmond, WA USA
Funding
European Research Council
Keywords
Sparse AllReduce; Sparse Input Vectors; Sparse AllGather; OPERATIONS; DESCENT;
DOI
10.1145/3295500.3356222
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Applying machine learning techniques to the quickly growing data in science and industry requires highly scalable algorithms. Large datasets are most commonly processed in a "data parallel" fashion, distributed across many nodes. Each node's contribution to the overall gradient is summed using a global allreduce. This allreduce is the single communication bottleneck, and thus the scalability bottleneck, for most machine learning workloads. We observe that frequently, many gradient values are (close to) zero, leading to sparse or sparsifiable communications. To exploit this insight, we analyze, design, and implement a set of communication-efficient protocols for sparse input data, in conjunction with efficient machine learning algorithms which can leverage these primitives. Our communication protocols generalize standard collective operations by allowing processes to contribute arbitrary sparse input data vectors. Our generic communication library, SparCML, extends MPI to support additional features, such as non-blocking (asynchronous) operations and low-precision data representations. As such, SparCML and its techniques will form the basis of future highly scalable machine learning frameworks.
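As a rough illustration of the idea in the abstract, the following minimal sketch simulates a sum-allreduce over sparse gradient vectors in a single process. It is not the SparCML API and uses no MPI; the function names (`sparsify_top_k`, `sparse_allreduce_sum`, `to_dense`), the top-k sparsification rule, and the sample gradients are all assumptions made for this example.

```python
from collections import defaultdict


def sparsify_top_k(dense, k):
    """Keep the k largest-magnitude nonzero entries as an {index: value} dict.

    Top-k selection is an assumption of this sketch, chosen as one common
    way to make a nearly sparse gradient explicitly sparse.
    """
    top = sorted(range(len(dense)), key=lambda i: abs(dense[i]), reverse=True)[:k]
    return {i: dense[i] for i in top if dense[i] != 0.0}


def sparse_allreduce_sum(contributions):
    """Sum sparse vectors given as {index: value} dicts.

    In a real allreduce every node would end up holding this result;
    here the nodes are simulated by a plain list in one process.
    """
    total = defaultdict(float)
    for vec in contributions:
        for idx, val in vec.items():
            total[idx] += val
    return dict(total)


def to_dense(sparse, length):
    """Expand an {index: value} dict back into a dense list."""
    out = [0.0] * length
    for idx, val in sparse.items():
        out[idx] = val
    return out


# Hypothetical per-node gradients, mostly zero.
node_grads = [
    [0.0, 0.5, 0.0, 0.0, -2.0, 0.0],
    [0.0, 0.0, 0.0, 1.5, -1.0, 0.0],
    [0.25, 0.0, 0.0, 0.0, 0.0, 0.0],
]

sparse_inputs = [sparsify_top_k(g, k=2) for g in node_grads]
reduced = sparse_allreduce_sum(sparse_inputs)
dense_result = to_dense(reduced, 6)
print(dense_result)
```

Only the index-value pairs would need to travel over the network in such a scheme, which is where the communication saving would come from when most entries are zero.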
Pages: 15