GLOBALLY CONVERGENT MULTILEVEL TRAINING OF DEEP RESIDUAL NETWORKS

Cited by: 6
Authors
Kopanicakova, Alena [1 ]
Krause, Rolf [1 ]
Affiliations
[1] Univ Svizzera italiana, Euler Inst, Lugano, Switzerland
Source
SIAM JOURNAL ON SCIENTIFIC COMPUTING | 2023, Vol. 45, No. 3
Funding
Swiss National Science Foundation;
Keywords
trust-region methods; multilevel minimization; deep residual networks; training algorithm; TRUST-REGION METHODS; OPTIMIZATION; ALGORITHMS;
DOI
10.1137/21M1434076
CLC classification number
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting minibatch sizes during training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical systems viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the fields of classification and regression.
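To make the dynamical-systems viewpoint concrete, the following minimal NumPy sketch (not the authors' implementation) treats each residual block as one forward Euler step of dx/dt = f(x, theta(t)), and builds a coarse level by keeping every other block while doubling the step size, so both levels integrate the same time horizon. All names (f, resnet_forward, coarsen_weights, h) and the specific tanh block are illustrative assumptions.

```python
import numpy as np

def f(x, W):
    """Right-hand side of one residual block: a simple tanh layer (assumed form)."""
    return np.tanh(W @ x)

def resnet_forward(x, weights, h):
    """Forward propagation as forward Euler: x_{k+1} = x_k + h * f(x_k, W_k)."""
    for W in weights:
        x = x + h * f(x, W)
    return x

def coarsen_weights(weights):
    """One plausible hierarchy construction: keep every other block, so the
    coarse network covers the same horizon T = L * h with half the blocks
    and step size 2h."""
    return weights[::2]

# Fine level: L blocks with step h; coarse level: L/2 blocks with step 2h.
rng = np.random.default_rng(0)
L, d, h = 8, 4, 0.1
weights = [0.1 * rng.standard_normal((d, d)) for _ in range(L)]
x0 = rng.standard_normal(d)

x_fine = resnet_forward(x0, weights, h)
x_coarse = resnet_forward(x0, coarsen_weights(weights), 2 * h)
```

Because both levels discretize the same underlying initial value problem, coarse-level corrections remain meaningful approximations of fine-level behavior, which is what makes the multilevel transfer operators sensible.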
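The curvature information mentioned in the abstract comes from symmetric rank-1 (SR1) quasi-Newton updates; the paper uses a limited-memory variant (L-SR1), whereas the sketch below shows the standard dense SR1 formula with the usual skip safeguard, purely as an illustration of the update rule.

```python
import numpy as np

def sr1_update(B, s, y, eps=1e-8):
    """Dense SR1 update of a Hessian approximation B from a step s and
    gradient difference y:
        B_{k+1} = B_k + (y - B s)(y - B s)^T / ((y - B s)^T s),
    skipped when the denominator is too small (standard safeguard)."""
    r = y - B @ s
    denom = r @ s
    if abs(denom) < eps * np.linalg.norm(r) * np.linalg.norm(s):
        return B  # skip the update to avoid numerical blow-up
    return B + np.outer(r, r) / denom
```

Unlike BFGS, SR1 updates need not stay positive definite, which lets them capture indefinite curvature; inside a trust-region framework such as RMTR this is acceptable because the trust region, not positive definiteness, controls the step.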
Pages: S254 - S280
Page count: 27