Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations

Cited by: 0
Authors
Lu, Yiping [1]
Zhong, Aoxiao [2]
Li, Quanzheng [2, 3, 4]
Dong, Bin [4, 5, 6]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Harvard Med Sch, Massachusetts Gen Hosp, MGH BWH Ctr Clin Data Sci, Boston, MA 02115 USA
[3] Peking Univ, Ctr Data Sci Hlth & Med, Beijing, Peoples R China
[4] Beijing Inst Big Data Res, Lab Biomed Image Anal, Beijing, Peoples R China
[5] Peking Univ, Beijing Int Ctr Math Res, Beijing, Peoples R China
[6] Peking Univ, Ctr Data Sci, Beijing, Peoples R China
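Funding
National Institutes of Health (USA);
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes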
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks have become the state-of-the-art models in numerous machine learning tasks. However, general guidance on network architecture design is still missing. In our work, we bridge deep neural network design with numerical differential equations. We show that many effective networks, such as ResNet, PolyNet, FractalNet and RevNet, can be interpreted as different numerical discretizations of differential equations. This finding brings us a brand new perspective on the design of effective deep architectures. We can take advantage of the rich knowledge in numerical analysis to guide us in designing new and potentially more effective deep networks. As an example, we propose a linear multi-step architecture (LM-architecture), which is inspired by the linear multi-step method for solving ordinary differential equations. The LM-architecture is an effective structure that can be applied to any ResNet-like network. In particular, we demonstrate that LM-ResNet and LM-ResNeXt (i.e. the networks obtained by applying the LM-architecture to ResNet and ResNeXt respectively) can achieve noticeably higher accuracy than ResNet and ResNeXt on both CIFAR and ImageNet with comparable numbers of trainable parameters. Moreover, on both CIFAR and ImageNet, LM-ResNet/LM-ResNeXt can significantly compress the original networks while maintaining a similar performance. This can be explained mathematically using the concept of modified equation from numerical analysis. Last but not least, we also establish a connection between stochastic control and noise injection in the training process, which helps to improve the generalization of the networks. Furthermore, by relating stochastic training strategy to stochastic dynamic systems, we can easily apply stochastic training to networks with the LM-architecture. As an example, we introduce stochastic depth to LM-ResNet and achieve significant improvement over the original LM-ResNet on CIFAR10.
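The two ideas named in the abstract can be illustrated with a minimal sketch: a residual block read as one forward Euler step, and a two-step linear multi-step style update that also reuses the previous state. The abstract does not give the update formula, so the form below is an assumption based on a generic linear two-step scheme. The names `residual_branch`, `resnet_forward`, `lm_forward` and the fixed scalar `k` are hypothetical and chosen only for illustration.

```python
import numpy as np


def residual_branch(x, weight):
    # Hypothetical residual function f(x); a small tanh layer stands in
    # for the convolutional branch of a real residual block.
    return 0.1 * np.tanh(weight @ x)


def resnet_forward(x0, weights):
    # ResNet-style update x_{n+1} = x_n + f(x_n), which has the same form
    # as a forward Euler step for dx/dt = f(x) with unit step size.
    x = x0
    for w in weights:
        x = x + residual_branch(x, w)
    return x


def lm_forward(x0, weights, k):
    # Assumed two-step update:
    #   x_{n+1} = (1 - k) * x_n + k * x_{n-1} + f(x_n)
    # k is a fixed scalar here (an assumption for this sketch).
    x_prev = x0
    # The first block has no earlier state, so it uses a plain residual step.
    x = x0 + residual_branch(x0, weights[0])
    for w in weights[1:]:
        x_next = (1.0 - k) * x + k * x_prev + residual_branch(x, w)
        x_prev, x = x, x_next
    return x


rng = np.random.default_rng(0)
x0 = rng.standard_normal(4)
weights = [rng.standard_normal((4, 4)) for _ in range(5)]

out_resnet = resnet_forward(x0, weights)
out_lm = lm_forward(x0, weights, k=0.3)
```

In this sketch, setting `k = 0` reduces `lm_forward` to `resnet_forward`, which shows how the two-step form adds only one scalar per block to a ResNet-like update.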
Pages: 10
Related papers
50 in total
  • [41] Advanced fractional calculus, differential equations and neural networks: analysis, modeling and numerical computations
    Baleanu, Dumitru
    Karaca, Yeliz
    Vazquez, Luis
    Macias-Diaz, Jorge E.
    PHYSICA SCRIPTA, 2023, 98 (11)
  • [42] THE NUMERICAL-SOLUTION OF LINEAR ORDINARY DIFFERENTIAL-EQUATIONS BY FEEDFORWARD NEURAL NETWORKS
    MEADE, AJ
    FERNANDEZ, AA
    MATHEMATICAL AND COMPUTER MODELLING, 1994, 19 (12) : 1 - 25
  • [43] Solving Parametric Partial Differential Equations with Deep Rectified Quadratic Unit Neural Networks
    Lei, Zhen
    Shi, Lei
    Zeng, Chenyu
    JOURNAL OF SCIENTIFIC COMPUTING, 2022, 93 (03)
  • [44] Adaptive deep neural networks methods for high-dimensional partial differential equations
    Zeng, Shaojie
    Zhang, Zong
    Zou, Qingsong
    JOURNAL OF COMPUTATIONAL PHYSICS, 2022, 463
  • [45] Solving Parametric Partial Differential Equations with Deep Rectified Quadratic Unit Neural Networks
    Zhen Lei
    Lei Shi
    Chenyu Zeng
    Journal of Scientific Computing, 2022, 93
  • [46] Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications
    Bachouch, Achref
    Hure, Come
    Langrene, Nicolas
    Huyen Pham
    METHODOLOGY AND COMPUTING IN APPLIED PROBABILITY, 2022, 24 (01) : 143 - 178
  • [47] Deep Neural Networks Algorithms for Stochastic Control Problems on Finite Horizon: Numerical Applications
    Achref Bachouch
    Côme Huré
    Nicolas Langrené
    Huyên Pham
    Methodology and Computing in Applied Probability, 2022, 24 : 143 - 178
  • [48] Neural-Network-Assisted Finite Difference Discretization for Numerical Solution of Partial Differential Equations
    Izsak, Ferenc
    Izsak, Rudolf
    ALGORITHMS, 2023, 16 (09)
  • [49] Solving differential equations with unsupervised neural networks
    Parisi, DR
    Mariani, MC
    Laborde, MA
    CHEMICAL ENGINEERING AND PROCESSING-PROCESS INTENSIFICATION, 2003, 42 (8-9) : 715 - 721
  • [50] Transferable Neural Networks for Partial Differential Equations
    Zezhong Zhang
    Feng Bao
    Lili Ju
    Guannan Zhang
    Journal of Scientific Computing, 2024, 99