GLOBALLY CONVERGENT MULTILEVEL TRAINING OF DEEP RESIDUAL NETWORKS

Cited by: 6
Authors
Kopanicakova, Alena [1]
Krause, Rolf [1]
Affiliations
[1] Univ Svizzera italiana, Euler Inst, Lugano, Switzerland
Source
SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2023 / Vol. 45 / No. 03
Funding
Swiss National Science Foundation;
Keywords
trust-region methods; multilevel minimization; deep residual networks; training algorithm; TRUST-REGION METHODS; OPTIMIZATION; ALGORITHMS;
DOI
10.1137/21M1434076
Chinese Library Classification number
O29 [Applied Mathematics];
Discipline classification code
070104;
Abstract
We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting minibatch sizes during training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical systems viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the field of classification and regression.
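The dynamical systems viewpoint mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar residual block `tanh(theta * x)`, the time horizon `T`, and the `coarsen` helper (which simply keeps every second layer) are assumptions made for illustration only, and the paper's actual transfer operators are not described in this record.

```python
import math


def resnet_forward(x0, thetas, T=1.0):
    """Propagate x0 through len(thetas) scalar residual blocks.

    Each block computes x_{k+1} = x_k + h * tanh(theta_k * x_k), which is one
    forward Euler step of dx/dt = tanh(theta(t) * x) with step size h = T / K.
    """
    h = T / len(thetas)
    x = x0
    for theta in thetas:
        x = x + h * math.tanh(theta * x)
    return x


def coarsen(thetas):
    """Hypothetical restriction to a coarser level: keep every second layer."""
    return thetas[::2]


# Fine network with 16 layers; parameters sampled from a smooth function of time.
fine = [math.sin(k / 16) for k in range(16)]
coarse = coarsen(fine)  # 8 layers, so the Euler step size doubles

y_fine = resnet_forward(0.5, fine)
y_coarse = resnet_forward(0.5, coarse)
```

Because both networks discretize the same initial value problem, `y_coarse` approximates `y_fine` at roughly half the cost, which is the property a multilevel hierarchy of this kind relies on.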
Pages: S254 - S280
Page count: 27
Related papers
50 in total
  • [31] Convergent decomposition techniques for training RBF neural networks
    Buzzi, C
    Grippo, L
    Sciandrone, M
    NEURAL COMPUTATION, 2001, 13 (08) : 1891 - 1920
  • [32] A Deep Residual Networks Accelerator on FPGA
    Zhao, YaQian
    Zhang, Xin
    Fang, Xing
    Li, Long
    Li, XueLei
    Guo, ZhenHua
    Liu, XuChen
    2019 ELEVENTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI 2019), 2019, : 13 - 17
  • [33] Convergence analysis of deep residual networks
    Huang, Wentao
    Zhang, Haizhang
    ANALYSIS AND APPLICATIONS, 2024, 22 (02) : 351 - 382
  • [34] Deep Residual Networks for Plankton Classification
    Li, Xiu
    Cui, Zuoying
    OCEANS 2016 MTS/IEEE MONTEREY, 2016,
  • [35] Identity Mappings in Deep Residual Networks
    He, Kaiming
    Zhang, Xiangyu
    Ren, Shaoqing
    Sun, Jian
    COMPUTER VISION - ECCV 2016, PT IV, 2016, 9908 : 630 - 645
  • [36] Deep limits of residual neural networks
    Thorpe, Matthew
    van Gennip, Yves
    RESEARCH IN THE MATHEMATICAL SCIENCES, 2023, 10 (01)
  • [37] Scaling Properties of Deep Residual Networks
    Cohen, Alain-Sam
    Cont, Rama
    Rossier, Alain
    Xu, Renyuan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [38] Diversified Radar Micro-Doppler Simulations as Training Data for Deep Residual Neural Networks
    Seyfioglu, Mehmet S.
    Erol, Baris
    Gurbuz, Sevgi Z.
    Amin, Moeness G.
    2018 IEEE RADAR CONFERENCE (RADARCONF18), 2018, : 612 - 617
  • [39] Training Very Deep Networks
    Srivastava, Rupesh Kumar
    Greff, Klaus
    Schmidhuber, Juergen
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [40] A GLOBALLY CONVERGENT STOCHASTIC-APPROXIMATION
    YAKOWITZ, S
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1993, 31 (01) : 30 - 40