ACCELERATED, PARALLEL, AND PROXIMAL COORDINATE DESCENT

Cited: 157
Authors
Fercoq, Olivier [1]
Richtarik, Peter [2]
Affiliations
[1] Telecom ParisTech, Inst Mines Telecom, LTCI, Paris, France
[2] Univ Edinburgh, Sch Math, Edinburgh, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
randomized coordinate descent; acceleration; parallel methods; proximal methods; complexity; partial separability; convex optimization; big data; algorithm
DOI
10.1137/130949993
Chinese Library Classification (CLC)
O29 [Applied Mathematics]
Discipline code
070104
Abstract
We propose a new randomized coordinate descent method for minimizing the sum of convex functions, each of which depends on a small number of coordinates only. Our method (APPROX) is simultaneously Accelerated, Parallel, and PROXimal; this is the first time such a method has been proposed. In the special case when the number of processors is equal to the number of coordinates, the method converges at the rate $2\bar{\omega}\bar{L}R^2/(k+1)^2$, where $k$ is the iteration counter, $\bar{\omega}$ is a data-weighted average degree of separability of the loss function, $\bar{L}$ is the average of the Lipschitz constants associated with the coordinates and individual functions in the sum, and $R$ is the distance of the initial point from the minimizer. We show that the method can be implemented without the need to perform full-dimensional vector operations, which is the major bottleneck of accelerated coordinate descent. The fact that the method depends on the average degree of separability, and not on the maximum degree, can be attributed to the use of new safe large stepsizes, leading to an improved expected separable overapproximation (ESO). These stepsizes are of independent interest and can be utilized in all existing parallel randomized coordinate descent algorithms based on the concept of ESO. In special cases, our method recovers several classical and recent algorithms, such as simple and accelerated proximal gradient descent, as well as serial, parallel, and distributed versions of randomized block coordinate descent. Our bounds match or improve on the best known bounds for these methods.
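Spelled out as a displayed bound, the fully parallel rate quoted above reads as follows. This is a restatement of the abstract only; writing the guarantee in expectation over the random coordinate choices, with composite objective $F$ and minimizer $x^*$, is assumed notation, as is standard for randomized methods:

    \[
      \mathbb{E}\big[F(x_k)\big] - F(x^*) \;\le\; \frac{2\,\bar{\omega}\,\bar{L}\,R^2}{(k+1)^2},
      \qquad R = \|x_0 - x^*\|.
    \]

To make the accelerated structure concrete, here is a minimal sketch of a serial accelerated proximal coordinate descent loop in the spirit of APPROX, applied to l1-regularized least squares. The y/z/x sequences, the theta recursion, the stepsize 1/(n*theta*L_i), and the prox step centered at z are assumptions modeled on standard accelerated coordinate descent templates, not a verbatim transcription of the paper's algorithm; this naive version also performs the full-dimensional vector operations that the paper's efficient implementation is designed to avoid.

    import numpy as np

    def accel_prox_cd(A, b, lam, iters=20000, seed=0):
        """Sketch: serial accelerated proximal coordinate descent for
        min_x 0.5*||A x - b||^2 + lam*||x||_1.
        Update constants are illustrative assumptions, not the paper's
        exact algorithm."""
        n = A.shape[1]
        rng = np.random.default_rng(seed)
        L = (A ** 2).sum(axis=0)       # coordinate Lipschitz constants L_i
        x = np.zeros(n)                # main iterate
        z = np.zeros(n)                # prox-center sequence
        theta = 1.0 / n                # theta_0 = tau/n with tau = 1
        for _ in range(iters):
            y = (1.0 - theta) * x + theta * z   # extrapolated point
            i = int(rng.integers(n))            # uniformly random coordinate
            g = A[:, i] @ (A @ y - b)           # partial derivative of smooth part
            step = 1.0 / (n * theta * L[i])
            t = z[i] - step * g
            z_new = np.sign(t) * max(abs(t) - step * lam, 0.0)  # soft-threshold (l1 prox)
            x = y                               # reuse y; it is rebuilt next pass
            x[i] += n * theta * (z_new - z[i])  # scaled coordinate correction
            z[i] = z_new
            # theta_{k+1} solves theta^2 = (1 - theta) * theta_k^2,
            # the recursion behind the O(1/k^2) rate
            theta = 0.5 * (np.sqrt(theta ** 4 + 4.0 * theta ** 2) - theta ** 2)
        return x

    if __name__ == "__main__":
        rng = np.random.default_rng(1)
        A = rng.standard_normal((60, 30))
        b = rng.standard_normal(60)
        x_hat = accel_prox_cd(A, b, lam=0.1)
        print("objective:", 0.5 * np.linalg.norm(A @ x_hat - b) ** 2
              + 0.1 * abs(x_hat).sum())

The point of the y/z/x split is that acceleration is carried entirely by the theta-weighted combination while each iteration queries the gradient of a single coordinate; the paper's contribution is making this pattern work with parallel block sampling, a general proximal term, and, via a change of variables, without the O(n) vector operations shown here.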
Pages: 1997-2023
Page count: 27
Related papers
50 in total (items [21]-[30] shown)
  • [21] Parallel Coordinate Descent Algorithms for Sparse Phase Retrieval
    Yang, Yang
    Pesavento, Marius
    Eldar, Yonina C.
    Ottersten, Bjoern
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7670 - 7674
  • [23] Parallel Asynchronous Stochastic Coordinate Descent with Auxiliary Variables
    Yu, Hsiang-Fu
    Hsieh, Cho-Jui
    Dhillon, Inderjit S.
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [24] Parallel coordinate descent methods for big data optimization
    Richtarik, Peter
    Takac, Martin
    MATHEMATICAL PROGRAMMING, 2016, 156 (1-2) : 433 - 484
  • [25] EFFICIENCY OF THE ACCELERATED COORDINATE DESCENT METHOD ON STRUCTURED OPTIMIZATION PROBLEMS
    Nesterov, Yurii
    Stich, Sebastian U.
    SIAM JOURNAL ON OPTIMIZATION, 2017, 27 (01) : 110 - 123
  • [26] Accelerated Coordinate Descent with Arbitrary Sampling and Best Rates for Minibatches
    Hanzely, Filip
    Richtarik, Peter
    22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89 : 304 - 312
  • [27] Accelerated Proximal Gradient Descent in Metric Learning for Kernel Regression
    Gonzalez, Hector
    Morell, Carlos
    Ferri, Francesc J.
    PROGRESS IN ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION, IWAIPR 2018, 2018, 11047 : 219 - 227
  • [28] Damping proximal coordinate descent algorithm for non-convex regularization
    Pan, Zheng
    Lin, Ming
    Hou, Guangdong
    Zhang, Changshui
    NEUROCOMPUTING, 2015, 152 : 151 - 163
  • [29] MASSIVE MIMO MULTICAST BEAMFORMING VIA ACCELERATED RANDOM COORDINATE DESCENT
    Wang, Shuai
    Cheng, Lei
    Xia, Minghua
    Wu, Yik-Chung
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 4494 - 4498
  • [30] Restarting the accelerated coordinate descent method with a rough strong convexity estimate
    Fercoq, Olivier
    Qu, Zheng
    COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2020, 75 : 63 - 91