GLOBALLY CONVERGENT MULTILEVEL TRAINING OF DEEP RESIDUAL NETWORKS

Cited by: 6
Authors
Kopanicakova, Alena [1]
Krause, Rolf [1]
Affiliations
[1] Univ Svizzera italiana, Euler Inst, Lugano, Switzerland
Source
SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2023 / Vol. 45 / No. 3
Funding
Swiss National Science Foundation
Keywords
trust-region methods; multilevel minimization; deep residual networks; training algorithm; TRUST-REGION METHODS; OPTIMIZATION; ALGORITHMS
DOI
10.1137/21M1434076
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting minibatch sizes during training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical systems viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the field of classification and regression.
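The dynamical systems viewpoint mentioned in the abstract can be illustrated with a minimal sketch: each residual layer performs one forward Euler step x_{k+1} = x_k + h f(x_k, theta_k) with step size h = T / K for K layers. The function names, the tanh layer, and the `coarsen` helper (which keeps every second layer) are assumptions made for illustration only; the abstract does not specify the paper's actual architecture or transfer operators.

```python
import math

def residual_block(x, weight, bias):
    """Illustrative layer function f(x, theta) = tanh(W x + b)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weight, bias)]

def resnet_forward(x0, layers, T=1.0):
    """Forward Euler in 'time': x_{k+1} = x_k + h * f(x_k, theta_k), h = T / K."""
    h = T / len(layers)
    x = list(x0)
    for weight, bias in layers:
        fx = residual_block(x, weight, bias)
        x = [xi + h * fi for xi, fi in zip(x, fx)]
    return x

def coarsen(layers):
    """Hypothetical coarse level: keep every second layer, so h doubles."""
    return layers[::2]

# Four identical layers with zero weights: f does not depend on the state,
# so the fine (h = 0.25) and coarse (h = 0.5) networks integrate the same ODE.
layers = [([[0.0, 0.0], [0.0, 0.0]], [0.5, -0.5])] * 4
fine = resnet_forward([1.0, 2.0], layers)
coarse = resnet_forward([1.0, 2.0], coarsen(layers))
```

In this toy setting the coarse network agrees with the fine one up to rounding, which hints at why a hierarchy of shallower networks can serve as cheap approximations of a deep one.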
Pages: S254 - S280
Number of pages: 27