Efficient Mini-batch Training for Stochastic Optimization

Cited by: 477
Authors
Li, Mu [1,2]
Zhang, Tong [2,3]
Chen, Yuqiang [2]
Smola, Alexander J. [1,4]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Baidu Inc, Beijing, Peoples R China
[3] Rutgers State Univ, New Brunswick, NJ USA
[4] Google Inc, Mountain View, CA USA
DOI: 10.1145/2623330.2623612
CLC number: TP18 [Artificial intelligence theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Stochastic gradient descent (SGD) is a popular technique for large-scale optimization problems in machine learning. In order to parallelize SGD, minibatch training needs to be employed to reduce the communication cost. However, an increase in minibatch size typically decreases the rate of convergence. This paper introduces a technique based on approximate optimization of a conservatively regularized objective function within each minibatch. We prove that the convergence rate does not decrease with increasing minibatch size. Experiments demonstrate that with suitable implementations of approximate optimization, the resulting algorithm can outperform standard SGD in many scenarios.
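As a rough illustration of the idea described in the abstract, the following Python sketch replaces the single gradient step per minibatch with an approximate minimization of the minibatch loss plus a conservative proximal term anchored at the current iterate. It assumes a least-squares loss, and the hyperparameters (eta, inner_steps, inner_lr, batch_size) are illustrative choices, not values or the exact solver from the paper.

import numpy as np

def minibatch_loss_grad(w, X, y):
    # Gradient of the average squared loss 0.5 * ||X w - y||^2 / b on one minibatch.
    return X.T @ (X @ w - y) / len(y)

def conservative_minibatch_sgd(X, y, batch_size=64, eta=0.1,
                               inner_steps=5, inner_lr=0.05, epochs=5, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            w_anchor = w.copy()
            # Approximately solve the conservatively regularized subproblem
            #   min_v  loss(v; Xb, yb) + ||v - w_anchor||^2 / (2 * eta)
            # with a few inner gradient steps instead of a single SGD step.
            v = w_anchor.copy()
            for _ in range(inner_steps):
                g = minibatch_loss_grad(v, Xb, yb) + (v - w_anchor) / eta
                v -= inner_lr * g
            w = v
    return w

# Toy usage: recover a linear model from noisy synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 10))
w_true = rng.normal(size=10)
y = X @ w_true + 0.01 * rng.normal(size=1000)
print("parameter error:", np.linalg.norm(conservative_minibatch_sgd(X, y) - w_true))

The conservative term keeps each inner solution close to the previous iterate, which is the mechanism the paper analyzes to allow larger minibatches without degrading the convergence rate; the paper's actual subproblem solvers and step-size choices differ from this sketch.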
Pages: 661-670
Page count: 10
Related papers (50 in total)
  • [1] Mini-batch stochastic subgradient for functional constrained optimization
    Singh, Nitesh Kumar
    Necoara, Ion
    Kungurtsev, Vyacheslav
    [J]. OPTIMIZATION, 2023
  • [2] An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization
    Feyzmahdavian, Hamid Reza
    Aytekin, Arda
    Johansson, Mikael
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 3740 - 3754
  • [3] An Asynchronous Mini-batch Algorithm for Regularized Stochastic Optimization
    Feyzmahdavian, Hamid Reza
    Aytekin, Arda
    Johansson, Mikael
    [J]. 2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 1384 - 1389
  • [4] Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
    Ghadimi, Saeed
    Lan, Guanghui
    Zhang, Hongchao
    [J]. MATHEMATICAL PROGRAMMING, 2016, 155 (1-2) : 267 - 305
  • [5] Mini-batch stochastic approximation methods for nonconvex stochastic composite optimization
    Saeed Ghadimi
    Guanghui Lan
    Hongchao Zhang
    [J]. Mathematical Programming, 2016, 155 : 267 - 305
  • [6] Confidence Score based Mini-batch Skipping for CNN Training on Mini-batch Training Environment
    Jo, Joongho
    Park, Jongsun
    [J]. 2020 17TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2020), 2020, : 129 - 130
  • [7] Mini-Batch Stochastic Three-Operator Splitting for Distributed Optimization
    Franci, Barbara
    Staudigl, Mathias
    [J]. IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2882 - 2887
  • [8] MBA: Mini-Batch AUC Optimization
    Gultekin, San
    Saha, Avishek
    Ratnaparkhi, Adwait
    Paisley, John
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (12) : 5561 - 5574
  • [9] ClusterEA: Scalable Entity Alignment with Stochastic Training and Normalized Mini-batch Similarities
    Gao, Yunjun
    Liu, Xiaoze
    Wu, Junyang
    Li, Tianyi
    Wang, Pengfei
    Chen, Lu
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 421 - 431
  • [10] Efficient mini-batch stochastic gradient descent with Centroidal Voronoi Tessellation for PDE-constrained optimization under uncertainty
    Chen, Liuhong
    Xiong, Meixin
    Ming, Ju
    He, Xiaoming
    [J]. PHYSICA D-NONLINEAR PHENOMENA, 2024, 467