A fully stochastic second-order trust region method

Cited by: 7
Authors
Curtis, Frank E. [1]
Shi, Rui [1]
Affiliations
[1] Lehigh Univ, Dept Ind & Syst Engn, Bethlehem, PA 18015 USA
Source
OPTIMIZATION METHODS & SOFTWARE | 2022 / Vol. 37 / Issue 03
Funding
National Science Foundation (USA);
Keywords
Stochastic optimization; finite-sum optimization; stochastic Newton methods; trust region methods; machine learning; deep neural networks; time series forecasting; QUASI-NEWTON METHOD; OPTIMIZATION METHODS;
DOI
10.1080/10556788.2020.1852403
Chinese Library Classification
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
A stochastic second-order trust region method is proposed, which can be viewed as an extension of the trust-region-ish (TRish) algorithm proposed by Curtis et al. [A stochastic trust region algorithm based on careful step normalization. INFORMS J. Optim. 1(3) 200-220, 2019]. In each iteration, a search direction is computed by (approximately) solving a subproblem defined by stochastic gradient and Hessian estimates. The algorithm has convergence guarantees in the fully stochastic regime, i.e. when each stochastic gradient is merely an unbiased estimate of the gradient with bounded variance and the stochastic Hessian estimates are bounded. This framework covers a variety of implementations, such as when the stochastic Hessians are defined by sampled second-order derivatives or diagonal matrices, such as in RMSprop, Adagrad, Adam and other popular algorithms. The proposed algorithm has a worst-case complexity guarantee in the nearly deterministic regime, i.e. when the stochastic gradients and Hessians are close in expectation to the true gradients and Hessians. The results of numerical experiments for training CNNs for image classification and an RNN for time series forecasting are presented. These results show that the algorithm can outperform a stochastic gradient and first-order TRish algorithm.
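The abstract describes the method as an extension of TRish, whose defining feature is a careful normalization of the step taken along a stochastic gradient. As a rough illustration only, the sketch below shows the first-order TRish step rule as it is commonly stated for the cited 2019 paper: the step is a scaled gradient when the estimate is very small or very large, and a normalized step of length alpha in between. The function name `trish_step` and the parameter names `alpha`, `gamma1`, `gamma2` are assumptions made for this example. The second-order subproblem with Hessian estimates that this record's paper proposes is not reproduced here.

```python
import numpy as np


def trish_step(g, alpha, gamma1, gamma2):
    """Return a first-order TRish-style step for a stochastic gradient estimate g.

    Assumed setup (hypothetical names): alpha > 0 is the step size and
    gamma1 > gamma2 > 0 define the norm interval [1/gamma1, 1/gamma2]
    on which the step is normalized to length alpha.
    """
    if not (alpha > 0 and gamma1 > gamma2 > 0):
        raise ValueError("require alpha > 0 and gamma1 > gamma2 > 0")

    g = np.asarray(g, dtype=float)
    norm_g = np.linalg.norm(g)

    if norm_g < 1.0 / gamma1:
        # Small gradient estimate: plain scaled step, length below alpha.
        return -gamma1 * alpha * g
    if norm_g <= 1.0 / gamma2:
        # Intermediate regime: normalized step of length exactly alpha.
        return -alpha * g / norm_g
    # Large gradient estimate: scaled step, length above alpha.
    return -gamma2 * alpha * g
```

The three branches agree at the interval endpoints, so the step length is a continuous function of the gradient norm. For example, `trish_step([0.6, 0.8], alpha=0.1, gamma1=4.0, gamma2=0.5)` falls in the normalized regime and returns a step of length 0.1.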
Pages: 844 - 877
Number of pages: 34
Related papers
50 in total
  • [31] A Distributed Second-Order Algorithm You Can Trust
    Dunner, Celestine
    Lucchi, Aurelien
    Gargiani, Matilde
    Bian, An
    Hofmann, Thomas
    Jaggi, Martin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [32] A SECOND-ORDER CONE BASED APPROACH FOR SOLVING THE TRUST-REGION SUBPROBLEM AND ITS VARIANTS
    Nam Ho-Nguyen
    Kilinc-Karzan, Fatma
    SIAM JOURNAL ON OPTIMIZATION, 2017, 27 (03) : 1485 - 1512
  • [33] Second-order structure function in fully developed turbulence
    Huang, Y. X.
    Schmitt, F. G.
    Lu, Z. M.
    Fougairolles, P.
    Gagne, Y.
    Liu, Y. L.
    PHYSICAL REVIEW E, 2010, 82 (02)
  • [34] SECOND-ORDER FULLY DISCRETIZED PROJECTION METHOD FOR INCOMPRESSIBLE NAVIER-STOKES EQUATIONS
    Guo, Daniel X.
    ELECTRONIC JOURNAL OF DIFFERENTIAL EQUATIONS, 2016, : 9 - 20
  • [35] Augmented Lagrangian method for second-order cone programs under second-order sufficiency
    Hang, Nguyen T. V.
    Mordukhovich, Boris S.
    Sarabi, M. Ebrahim
    JOURNAL OF GLOBAL OPTIMIZATION, 2022, 82 (01) : 51 - 81
  • [36] Augmented Lagrangian method for second-order cone programs under second-order sufficiency
    Nguyen T. V. Hang
    Boris S. Mordukhovich
    M. Ebrahim Sarabi
    Journal of Global Optimization, 2022, 82 : 51 - 81
  • [37] A SMOOTHING PENALIZED SAMPLE AVERAGE APPROXIMATION METHOD FOR STOCHASTIC PROGRAMS WITH SECOND-ORDER STOCHASTIC DOMINANCE CONSTRAINTS
    Sun, Hailin
    Xu, Huifu
    Wang, Yong
    ASIA-PACIFIC JOURNAL OF OPERATIONAL RESEARCH, 2013, 30 (03)
  • [38] SECOND-ORDER METHOD OF EARTHQUAKE PREDICTION
    BAGBY, JP
    TRANSACTIONS-AMERICAN GEOPHYSICAL UNION, 1974, 55 (04) : 222 - 222
  • [39] A second-order particle tracking method
    Seok Lee
    Heung-Jae Lie
    Kyu-Min Song
    Chong-Jeanne Lim
    Ocean Science Journal, 2005, 40 (4) : 201 - 208
  • [40] A second-order stochastic resonance method enhanced by fractional-order derivative for mechanical fault detection
    Zijian Qiao
    Ahmed Elhattab
    Xuedao Shu
    Changbo He
    Nonlinear Dynamics, 2021, 106 : 707 - 723