Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Cited by: 0
Authors
Lu, Yiping [1]
Zhong, Aoxiao [2]
Li, Quanzheng [2,3,4]
Dong, Bin [4,5,6]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Harvard Med Sch, Massachusetts Gen Hosp, MGH BWH Ctr Clin Data Sci, Boston, MA 02115 USA
[3] Peking Univ, Ctr Data Sci Hlth & Med, Beijing, Peoples R China
[4] Beijing Inst Big Data Res, Lab Biomed Image Anal, Beijing, Peoples R China
[5] Peking Univ, Beijing Int Ctr Math Res, Beijing, Peoples R China
[6] Peking Univ, Ctr Data Sci, Beijing, Peoples R China
Funding
U.S. National Institutes of Health;
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance on network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings a brand new perspective on the design of effective deep architectures: we can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture) inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be used on any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture to ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. Moreover, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress the original networks while maintaining similar performance. This can be explained mathematically using the concept of the modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process, which helps to improve the generalization of the networks. Furthermore, by relating the stochastic training strategy to stochastic dynamical systems, we can easily apply stochastic training to networks with the LM-architecture. As an example, we introduce stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR-10.
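To make the ODE view concrete, the sketch below (a minimal PyTorch illustration, not the authors' released code; the class names, channel counts, block composition, and the zero initialization of the trainable scalar k are assumptions made for the example) contrasts a plain residual block, read as one forward-Euler step x_{n+1} = x_n + f(x_n), with a two-step update of the form x_{n+1} = (1 - k_n) x_n + k_n x_{n-1} + f(x_n) in the spirit of the LM-architecture described in the abstract.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Plain pre-activation residual block, read as one forward-Euler step."""
    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
        )

    def forward(self, x):
        # x_{n+1} = x_n + f(x_n): forward Euler with unit step size
        return x + self.f(x)

class LMBlock(nn.Module):
    """Two-step (linear multi-step style) block that also uses the previous state."""
    def __init__(self, channels):
        super().__init__()
        self.f = ResidualBlock(channels).f      # same residual mapping f as above
        self.k = nn.Parameter(torch.zeros(1))   # trainable scalar k_n (zero init is an assumption)

    def forward(self, x_prev, x):
        # x_{n+1} = (1 - k_n) x_n + k_n x_{n-1} + f(x_n)
        x_next = (1.0 - self.k) * x + self.k * x_prev + self.f(x)
        return x, x_next                        # carry (x_n, x_{n+1}) to the next block

if __name__ == "__main__":
    x = torch.randn(2, 16, 32, 32)              # toy batch: 2 images, 16 channels, 32x32
    blocks = nn.ModuleList([LMBlock(16) for _ in range(3)])
    x_prev = x                                  # simple choice: initialize x_{-1} = x_0
    for block in blocks:
        x_prev, x = block(x_prev, x)
    print(x.shape)                              # torch.Size([2, 16, 32, 32])

The only structural change relative to a standard ResNet stage is that each block carries the previous feature map forward, which is why such a two-step scheme can be dropped into any ResNet-like network.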
Pages: 10
Related papers
50 items in total
  • [21] A Survey of Accelerator Architectures for Deep Neural Networks
    Chen, Yiran
    Xie, Yuan
    Song, Linghao
    Chen, Fan
    Tang, Tianqi
    ENGINEERING, 2020, 6 (03) : 264 - 274
  • [22] Neuromorphic Architectures for Spiking Deep Neural Networks
    Indiveri, Giacomo
    Corradi, Federico
    Qiao, Ning
    2015 IEEE INTERNATIONAL ELECTRON DEVICES MEETING (IEDM), 2015,
  • [23] Simplifying Deep Neural Networks for Neuromorphic Architectures
    Chung, Jaeyong
    Shin, Taehwan
    2016 ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2016,
  • [24] PDE-GCN: Novel Architectures for Graph Neural Networks Motivated by Partial Differential Equations
    Eliasof, Moshe
    Haber, Eldad
    Treister, Eran
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [25] AN ALGORITHM FOR NUMERICAL SOLUTION OF DIFFERENTIAL EQUATIONS USING HARMONY SEARCH AND NEURAL NETWORKS
    Yadav, Neha
    Thi Thuy Ngo
    Kim, Joong Hoon
    JOURNAL OF APPLIED ANALYSIS AND COMPUTATION, 2022, 12 (04) : 1277 - 1293
  • [26] Improved Deep Neural Networks with Domain Decomposition in Solving Partial Differential Equations
    Wu, Wei
    Feng, Xinlong
    Xu, Hui
    JOURNAL OF SCIENTIFIC COMPUTING, 2022, 93 (01)
  • [28] DEEP NEURAL NETWORKS WITH FLEXIBLE COMPLEXITY WHILE TRAINING BASED ON NEURAL ORDINARY DIFFERENTIAL EQUATIONS
    Luo, Zhengbo
    Kamata, Sei-ichiro
    Sun, Zitang
    Zhou, Weilian
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 1690 - 1694
  • [29] On the differential equations of recurrent neural networks
    Aouiti, Chaouki
    Ghanmi, Boulbaba
    Miraoui, Mohsen
    INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2021, 98 (07) : 1385 - 1407
  • [30] Memristor crossbar architectures for implementing deep neural networks
    Xiaoyang Liu
    Zhigang Zeng
    Complex & Intelligent Systems, 2022, 8 : 787 - 802