A Progressive Batching L-BFGS Method for Machine Learning

Cited by: 0
Authors
Bollapragada, Raghu [1]
Mudigere, Dheevatsa [2]
Nocedal, Jorge [1]
Shi, Hao-Jun Michael [1]
Tang, Ping Tak Peter [3]
Affiliations
[1] Northwestern Univ, Dept Ind Engn & Management Sci, Evanston, IL 60201 USA
[2] Intel Corp, Bangalore, Karnataka, India
[3] Intel Corp, Santa Clara, CA USA
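Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract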
The standard L-BFGS method relies on gradient approximations that are not dominated by noise, so that search directions are descent directions, the line search is reliable, and quasi-Newton updating yields useful quadratic models of the objective function. All of this appears to call for a full batch approach, but since small batch sizes give rise to faster algorithms with better generalization properties, L-BFGS is currently not considered an algorithm of choice for large-scale machine learning applications. One need not, however, choose between the two extremes represented by the full batch or highly stochastic regimes, and may instead follow a progressive batching approach in which the sample size increases during the course of the optimization. In this paper, we present a new version of the L-BFGS algorithm that combines three basic components - progressive batching, a stochastic line search, and stable quasi-Newton updating - and that performs well on training logistic regression and deep neural networks. We provide supporting convergence theory for the method.
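The three components named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how the sample size grows, how the stochastic line search is defined, or how updates are stabilized, so everything below is an assumption made for illustration. The sketch uses a fixed geometric growth factor for the batch, Armijo backtracking evaluated on the current batch, and curvature pairs formed from gradient differences on the same batch (skipped when the curvature is too small). The names `progressive_lbfgs`, `two_loop`, `loss_and_grad`, and all parameters are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def loss_and_grad(w, batch):
    """Mean logistic loss and gradient over (features, label) pairs, label in {-1, +1}."""
    loss, grad = 0.0, [0.0] * len(w)
    for x, y in batch:
        m = y * dot(w, x)
        e = math.exp(-abs(m))                       # never overflows
        loss += math.log1p(e) + max(-m, 0.0)        # log(1 + exp(-m))
        sig = e / (1.0 + e) if m >= 0 else 1.0 / (1.0 + e)  # sigmoid(-m)
        for i, xi in enumerate(x):
            grad[i] -= y * sig * xi
    n = len(batch)
    return loss / n, [g / n for g in grad]

def two_loop(grad, pairs):
    """Standard L-BFGS two-loop recursion; returns the search direction -H * grad."""
    q, alphas = list(grad), []
    for s, y in reversed(pairs):                    # newest pair first
        a = dot(s, q) / dot(y, s)
        alphas.append(a)
        q = [qi - a * yi for qi, yi in zip(q, y)]
    if pairs:                                       # initial Hessian scaling
        s, y = pairs[-1]
        gamma = dot(s, y) / dot(y, y)
        q = [gamma * qi for qi in q]
    for (s, y), a in zip(pairs, reversed(alphas)):  # oldest pair first
        b = dot(y, q) / dot(y, s)
        q = [qi + (a - b) * si for qi, si in zip(q, s)]
    return [-qi for qi in q]

def progressive_lbfgs(data, dim, iters=30, batch0=8, growth=1.3, memory=5, seed=0):
    rng = random.Random(seed)
    w, pairs, batch_size = [0.0] * dim, [], float(batch0)
    for _ in range(iters):
        batch = rng.sample(data, min(len(data), int(batch_size)))
        f, g = loss_and_grad(w, batch)
        d = two_loop(g, pairs)
        gd = dot(g, d)
        # Assumption: Armijo backtracking on the sampled loss stands in
        # for the stochastic line search mentioned in the abstract.
        t = 1.0
        while t > 1e-8:
            w_new = [wi + t * di for wi, di in zip(w, d)]
            f_new, g_new = loss_and_grad(w_new, batch)
            if f_new <= f + 1e-4 * t * gd:
                break
            t *= 0.5
        # Assumption: both gradients come from the same batch, and the pair
        # is stored only if the curvature s.y is sufficiently positive.
        s = [a - b for a, b in zip(w_new, w)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-8 * dot(s, s):
            pairs = (pairs + [(s, y)])[-memory:]
        w = w_new
        # Assumption: geometric growth of the sample size.
        batch_size = min(float(len(data)), batch_size * growth)
    return w
```

Calling `progressive_lbfgs(data, dim)` on a list of `(features, label)` pairs returns a weight vector. Early iterations are cheap because they touch only a few samples, while later iterations approach the full-batch regime in which quasi-Newton updates and line searches are more reliable.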
Pages: 10
Related papers
50 in total
  • [1] A Multi-Batch L-BFGS Method for Machine Learning
    Berahas, Albert S.
    Nocedal, Jorge
    Takac, Martin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [2] A robust multi-batch L-BFGS method for machine learning*
    Berahas, Albert S.
    Takac, Martin
    [J]. OPTIMIZATION METHODS & SOFTWARE, 2020, 35 (01): : 191 - 219
  • [3] A Method for Stochastic L-BFGS Optimization
    Qi, Peng
    Zhou, Wei
    Han, Jizhong
    [J]. 2017 2ND IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2017), 2017, : 156 - 160
  • [4] Shifted L-BFGS systems
    Erway, Jennifer B.
    Jain, Vibhor
    Marcia, Roummel F.
    [J]. OPTIMIZATION METHODS & SOFTWARE, 2014, 29 (05): : 992 - 1004
  • [5] A structured L-BFGS method and its application to inverse problems
    Mannel, Florian
    Aggrawal, Hari Om
    Modersitzki, Jan
    [J]. INVERSE PROBLEMS, 2024, 40 (04)
  • [6] A structured L-BFGS method and its application to inverse problems
    Mannel, Florian
    Om Aggrawal, Hari
    Modersitzki, Jan
    [J]. Inverse Problems, 40 (04):
  • [7] Laplace-domain waveform inversion using the l-BFGS method
    Lee, Jongwoo
    Ha, Wansoo
    [J]. GEOSYSTEM ENGINEERING, 2019, 22 (04) : 214 - 224
  • [8] QUASI-NEWTON TYPE OF DIAGONAL UPDATING FOR THE L-BFGS METHOD
    Sahari, M. L.
    Khaldi, R.
    [J]. ACTA MATHEMATICA UNIVERSITATIS COMENIANAE, 2009, 78 (02): : 173 - 181
  • [9] Large-scale distributed L-BFGS
    Najafabadi M.M.
    Khoshgoftaar T.M.
    Villanustre F.
    Holt J.
    [J]. Journal of Big Data, 4 (1)
  • [10] Improving L-BFGS Initialization For Trust-Region Methods In Deep Learning
    Rafati, Jacob
    Marcia, Roummel F.
    [J]. 2018 17TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2018, : 501 - 508