GLOBALLY CONVERGENT MULTILEVEL TRAINING OF DEEP RESIDUAL NETWORKS

Cited by: 6
Authors
Kopanicakova, Alena [1 ]
Krause, Rolf [1 ]
Affiliations
[1] Univ Svizzera italiana, Euler Inst, Lugano, Switzerland
Source
SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2023, Vol. 45, No. 3
Funding
Swiss National Science Foundation;
Keywords
trust-region methods; multilevel minimization; deep residual networks; training algorithm; TRUST-REGION METHODS; OPTIMIZATION; ALGORITHMS;
DOI
10.1137/21M1434076
CLC classification number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting minibatch sizes during training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical systems viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the fields of classification and regression.
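To make the dynamical-systems viewpoint concrete, the following minimal NumPy sketch (not the authors' implementation) treats each residual block as one forward Euler step of dx/dt = f(x, theta(t)), and builds a coarse level by keeping every other block while doubling the step size, so both levels integrate the same time horizon. All names (f, resnet_forward, coarsen_weights, h) and the specific tanh block are illustrative assumptions.

```python
import numpy as np

def f(x, W):
    """Right-hand side of one residual block: a simple tanh layer (assumed form)."""
    return np.tanh(W @ x)

def resnet_forward(x, weights, h):
    """Forward propagation as forward Euler: x_{k+1} = x_k + h * f(x_k, W_k)."""
    for W in weights:
        x = x + h * f(x, W)
    return x

def coarsen_weights(weights):
    """One plausible hierarchy construction: keep every other block, so the
    coarse network covers the same horizon T = L * h with half the blocks
    and step size 2h."""
    return weights[::2]

# Fine level: L blocks with step h; coarse level: L/2 blocks with step 2h.
rng = np.random.default_rng(0)
L, d, h = 8, 4, 0.1
weights = [0.1 * rng.standard_normal((d, d)) for _ in range(L)]
x0 = rng.standard_normal(d)

x_fine = resnet_forward(x0, weights, h)
x_coarse = resnet_forward(x0, coarsen_weights(weights), 2 * h)
```

Because both levels discretize the same underlying initial value problem, coarse-level corrections remain meaningful approximations of fine-level behavior, which is what makes the multilevel transfer operators sensible.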
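The curvature information mentioned in the abstract comes from symmetric rank-1 (SR1) quasi-Newton updates; the paper uses a limited-memory variant (L-SR1), whereas the sketch below shows the standard dense SR1 formula with the usual skip safeguard, purely as an illustration of the update rule.

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Dense SR1 update of a Hessian approximation B from a step s and
    gradient difference y:
        B_{k+1} = B_k + (y - B s)(y - B s)^T / ((y - B s)^T s),
    skipped when the denominator is too small (standard safeguard)."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B  # skip the update to avoid numerical blow-up
    return B + np.outer(r, r) / denom
```

Unlike BFGS, SR1 updates need not stay positive definite, which lets them capture indefinite curvature; inside a trust-region framework such as RMTR this is acceptable because the trust region, not positive definiteness, controls the step.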
Pages: S254 - S280
Page count: 27