Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Cited by: 0
Authors
Lu, Yiping [1]
Zhong, Aoxiao [2]
Li, Quanzheng [2,3,4]
Dong, Bin [4,5,6]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Harvard Med Sch, Massachusetts Gen Hosp, MGH BWH Ctr Clin Data Sci, Boston, MA 02115 USA
[3] Peking Univ, Ctr Data Sci Hlth & Med, Beijing, Peoples R China
[4] Beijing Inst Big Data Res, Lab Biomed Image Anal, Beijing, Peoples R China
[5] Peking Univ, Beijing Int Ctr Math Res, Beijing, Peoples R China
[6] Peking Univ, Ctr Data Sci, Beijing, Peoples R China
Funding
U.S. National Institutes of Health;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance on network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings a brand new perspective on the design of effective deep architectures: we can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture) inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture to ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. Moreover, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress the original networks while maintaining similar performance. This can be explained mathematically using the concept of the modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process, which helps to improve the generalization of the networks. Furthermore, by relating the stochastic training strategy to stochastic dynamical systems, we can easily apply stochastic training to networks with the LM-architecture. As an example, we introduce stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR-10.
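To make the ODE view concrete, the sketch below (a minimal PyTorch illustration, not the authors' released code; the class names, channel counts, block composition, and the zero initialization of the trainable scalar k are assumptions made for the example) contrasts a plain residual block, read as one forward-Euler step x_{n+1} = x_n + f(x_n), with a two-step update of the form x_{n+1} = (1 - k_n) x_n + k_n x_{n-1} + f(x_n) in the spirit of the LM-architecture described in the abstract.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain pre-activation residual block, read as one forward-Euler step."""
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, x):
        # x_{n+1} = x_n + f(x_n): forward Euler with unit step size
        return x + self.f(x)

class LMBlock(nn.Module):
    """Two-step (linear multi-step style) block that also uses the previous state."""
    def __init__(self, channels):
        super().__init__()
        self.f = ResidualBlock(channels).f      # same residual mapping f as above
        self.k = nn.Parameter(torch.zeros(1))   # trainable scalar k_n (zero init is an assumption)

    def forward(self, x_prev, x):
        # x_{n+1} = (1 - k_n) x_n + k_n x_{n-1} + f(x_n)
        x_next = (1.0 - self.k) * x + self.k * x_prev + self.f(x)
        return x, x_next                        # carry (x_n, x_{n+1}) to the next block

if __name__ == "__main__":
    x = torch.randn(2, 16, 32, 32)              # toy batch: 2 images, 16 channels, 32x32
    blocks = nn.ModuleList([LMBlock(16) for _ in range(3)])
    x_prev = x                                  # simple choice: initialize x_{-1} = x_0
    for block in blocks:
        x_prev, x = block(x_prev, x)
    print(x.shape)                              # torch.Size([2, 16, 32, 32])

The only structural change relative to a standard ResNet stage is that each block carries the previous feature map forward, which is why such a two-step scheme can be dropped into any ResNet-like network.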
Pages: 10
Related papers
50 items in total
  • [21] A Survey of Accelerator Architectures for Deep Neural Networks
    Chen, Yiran
    Xie, Yuan
    Song, Linghao
    Chen, Fan
    Tang, Tianqi
    ENGINEERING, 2020, 6 (03) : 264 - 274
  • [22] Neuromorphic Architectures for Spiking Deep Neural Networks
    Indiveri, Giacomo
    Corradi, Federico
    Qiao, Ning
    2015 IEEE INTERNATIONAL ELECTRON DEVICES MEETING (IEDM), 2015,
  • [23] Simplifying Deep Neural Networks for Neuromorphic Architectures
    Chung, Jaeyong
    Shin, Taehwan
    2016 ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2016,
  • [24] PDE-GCN: Novel Architectures for Graph Neural Networks Motivated by Partial Differential Equations
    Eliasof, Moshe
    Haber, Eldad
    Treister, Eran
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [25] AN ALGORITHM FOR NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS USING HARMONY SEARCH AND NEURAL NETWORKS
    Yadav, Neha
    Thi Thuy Ngo
    Kim, Joong Hoon
    JOURNAL OF APPLIED ANALYSIS AND COMPUTATION, 2022, 12 (04) : 1277 - 1293
  • [26] Improved Deep Neural Networks with Domain Decomposition in Solving Partial Differential Equations
    Wu, Wei
    Feng, Xinlong
    Xu, Hui
    JOURNAL OF SCIENTIFIC COMPUTING, 2022, 93 (01)
  • [28] DEEP NEURAL NETWORKS WITH FLEXIBLE COMPLEXITY WHILE TRAINING BASED ON NEURAL ORDINARY DIFFERENTIAL EQUATIONS
    Luo, Zhengbo
    Kamata, Sei-ichiro
    Sun, Zitang
    Zhou, Weilian
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 1690 - 1694
  • [29] On the differential equations of recurrent neural networks
    Aouiti, Chaouki
    Ghanmi, Boulbaba
    Miraoui, Mohsen
    INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2021, 98 (07) : 1385 - 1407
  • [30] Memristor crossbar architectures for implementing deep neural networks
    Xiaoyang Liu
    Zhigang Zeng
    Complex & Intelligent Systems, 2022, 8 : 787 - 802