ACCELERATED, PARALLEL, AND PROXIMAL COORDINATE DESCENT

Cited: 157
Authors
Fercoq, Olivier [1 ]
Richtarik, Peter [2 ]
Affiliations
[1] Telecom ParisTech, Inst Mines Telecom, LTCI, Paris, France
[2] Univ Edinburgh, Sch Math, Edinburgh, Midlothian, Scotland
Funding
Engineering and Physical Sciences Research Council (EPSRC);
Keywords
randomized coordinate descent; acceleration; parallel methods; proximal methods; complexity; partial separability; convex optimization; big data; ALGORITHM;
DOI
10.1137/130949993
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
We propose a new randomized coordinate descent method for minimizing the sum of convex functions, each of which depends on a small number of coordinates only. Our method (APPROX) is simultaneously Accelerated, Parallel, and PROXimal; this is the first time such a method has been proposed. In the special case when the number of processors equals the number of coordinates, the method converges at the rate 2ω̄L̄R²/(k + 1)², where k is the iteration counter, ω̄ is a data-weighted average degree of separability of the loss function, L̄ is the average of the Lipschitz constants associated with the coordinates and individual functions in the sum, and R is the distance of the initial point from the minimizer. We show that the method can be implemented without the need to perform full-dimensional vector operations, which is the major bottleneck of accelerated coordinate descent. The fact that the method depends on the average degree of separability, rather than the maximum degree, can be attributed to the use of new safe large stepsizes, leading to an improved expected separable overapproximation (ESO). These stepsizes are of independent interest and can be utilized in all existing parallel randomized coordinate descent algorithms based on the concept of ESO. In special cases, our method recovers several classical and recent algorithms, such as simple and accelerated proximal gradient descent, as well as serial, parallel, and distributed versions of randomized block coordinate descent. Our bounds match or improve on the best known bounds for these methods.
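The abstract notes that APPROX recovers accelerated proximal gradient descent as a special case. As a concrete illustration of that special case only (not the paper's parallel coordinate method), here is a minimal FISTA-style sketch applied to a lasso problem; the function names and the choice of problem are illustrative assumptions, not from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (coordinate-wise shrinkage).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(A, b, lam, iters=200):
    # Accelerated proximal gradient (FISTA) for
    #   min_x 0.5 * ||Ax - b||^2 + lam * ||x||_1,
    # one classical method the abstract says APPROX specializes to.
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(A.shape[1])
    z = x.copy()                           # extrapolated point
    t = 1.0                                # momentum parameter
    for _ in range(iters):
        grad = A.T @ (A @ z - b)           # gradient of the smooth part at z
        x_new = soft_threshold(z - grad / L, lam / L)  # proximal gradient step
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)  # Nesterov extrapolation
        x, t = x_new, t_new
    return x
```

The full APPROX method replaces the full gradient step above with updates of a random subset of coordinate blocks using the paper's ESO-based stepsizes, which is what removes the full-dimensional vector operations the abstract mentions.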
Pages: 1997-2023 (27 pages)
Related Papers (50 items total)
  • [41] Accelerated Stochastic Greedy Coordinate Descent by Soft Thresholding Projection onto Simplex
    Song, Chaobing
    Cui, Shaobo
    Jiang, Yong
    Xia, Shu-Tao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [42] An Accelerated Coordinate Gradient Descent Algorithm for Non-separable Composite Optimization
    Aberdam, Aviad
    Beck, Amir
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2022, 193 (1-3) : 219 - 246
  • [43] PARALLEL STOCHASTIC ASYNCHRONOUS COORDINATE DESCENT: TIGHT BOUNDS ON THE POSSIBLE PARALLELISM
    Cheung, Yun Kuen
    Cole, Richard J.
    Tao, Yixin
    SIAM JOURNAL ON OPTIMIZATION, 2021, 31 (01) : 448 - 460
  • [44] Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling
    Allen-Zhu, Zeyuan
    Qu, Zheng
    Richtarik, Peter
    Yuan, Yang
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [46] Nonconvex Regularized Robust PCA Using the Proximal Block Coordinate Descent Algorithm
    Wen, Fei
    Ying, Rendong
    Liu, Peilin
    Truong, Trieu-Kien
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (20) : 5402 - 5416
  • [47] Sparse Representation and Dictionary Learning Based on Alternating Parallel Coordinate Descent
    Tang, Zunyi
    Tamura, Toshiyo
    Ding, Shuxue
    Li, Zhenni
    2013 INTERNATIONAL JOINT CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY & UBI-MEDIA COMPUTING (ICAST-UMEDIA), 2013, : 491+
  • [48] Scalable Coordinate Descent Approaches to Parallel Matrix Factorization for Recommender Systems
    Yu, Hsiang-Fu
    Hsieh, Cho-Jui
    Si, Si
    Dhillon, Inderjit
    12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2012), 2012, : 765 - 774
  • [49] Efficient Accelerated Coordinate Descent Methods and Faster Algorithms for Solving Linear Systems
    Lee, Yin Tat
    Sidford, Aaron
    2013 IEEE 54TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2013, : 147 - 156
  • [50] When Cyclic Coordinate Descent Outperforms Randomized Coordinate Descent
    Gurbuzbalaban, Mert
    Ozdaglar, Asuman
    Parrilo, Pablo A.
    Vanli, N. Denizcan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30