SySCD: A System-Aware Parallel Coordinate Descent Algorithm

Cited by: 0
Authors
Ioannou, Nikolas [1]
Mendler-Dunner, Celestine [1, 2]
Parnell, Thomas [1]
Affiliations
[1] IBM Res, Zurich, Switzerland
[2] Univ Calif Berkeley, Berkeley, CA USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we propose a novel parallel stochastic coordinate descent (SCD) algorithm with convergence guarantees that exhibits strong scalability. We start by studying a state-of-the-art parallel implementation of SCD and identify its scalability and system-level performance bottlenecks. We then take a principled approach to develop a new SCD variant designed to avoid the identified system bottlenecks, such as limited scaling due to the coherence traffic of model sharing across threads and inefficient CPU cache accesses. Our proposed system-aware parallel coordinate descent algorithm (SySCD) scales to many cores and across NUMA nodes, and offers a consistent bottom-line speedup in training time of up to 12× compared to an optimized asynchronous parallel SCD algorithm and up to 42× compared to state-of-the-art GLM solvers (scikit-learn, Vowpal Wabbit, and H2O) on a range of datasets and multi-core CPU architectures.
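For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of plain sequential stochastic coordinate descent, applied to ridge regression as an assumed example of a GLM objective. It is not SySCD and does not reproduce the paper's implementation; the function names `scd_ridge` and `ridge_objective`, the objective scaling, and the default parameters are illustrative assumptions. The residual vector `r` is read and written by every coordinate update, which illustrates the kind of shared state that could cause coherence traffic if several threads ran such updates concurrently.

```python
import random


def scd_ridge(X, y, lam=0.1, epochs=50, seed=0):
    """Sequential stochastic coordinate descent for ridge regression.

    Minimizes (1/(2n)) * ||Xw - y||^2 + (lam/2) * ||w||^2.
    X is a list of n rows, each a list of d floats (hypothetical layout).
    """
    n, d = len(X), len(X[0])
    rng = random.Random(seed)
    w = [0.0] * d
    # Residual r = Xw - y; with w = 0 this is simply -y.
    r = [-yi for yi in y]
    # Per-coordinate curvature: (1/n) * ||X[:, j]||^2 + lam.
    curv = [sum(X[i][j] ** 2 for i in range(n)) / n + lam for j in range(d)]
    coords = list(range(d))
    for _ in range(epochs):
        rng.shuffle(coords)  # fresh random coordinate order each epoch
        for j in coords:
            # Partial derivative of the objective with respect to w[j].
            grad = sum(X[i][j] * r[i] for i in range(n)) / n + lam * w[j]
            delta = -grad / curv[j]  # exact minimizer along coordinate j
            w[j] += delta
            # Keep the residual consistent with the updated model.
            for i in range(n):
                r[i] += delta * X[i][j]
    return w


def ridge_objective(X, y, w, lam):
    """Evaluate the ridge objective minimized by scd_ridge."""
    n = len(X)
    sq = sum(
        (sum(xi * wi for xi, wi in zip(row, w)) - yi) ** 2
        for row, yi in zip(X, y)
    )
    return sq / (2 * n) + lam / 2 * sum(wi * wi for wi in w)
```

Each inner step touches one column of `X` and the whole of `r`, so the memory access pattern, rather than arithmetic, tends to dominate the cost; this is the general setting in which the system-level concerns named in the abstract arise.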
Pages: 11