The Loss Surface of Deep and Wide Neural Networks

Cited: 0
Authors
Nguyen, Quynh [1]
Hein, Matthias [1]
Affiliations
[1] Saarland Univ, Dept Math & Comp Sci, Saarbrucken, Germany
Funding
European Research Council;
Keywords
LOCAL MINIMA;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that deep networks can be trained without getting stuck in suboptimal points. It has been argued that this is because all local minima are close to being globally optimal. We show that this is (almost) true: in fact, almost all local minima are globally optimal, for a fully connected network with squared loss and an analytic activation function, provided that the number of hidden units of one layer of the network is larger than the number of training points and the network structure from this layer on is pyramidal.
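The width condition in the abstract can be illustrated with a small numerical sketch (this is an illustration of why a zero-loss global minimum exists under the condition, not the paper's proof): if one hidden layer has more units m than training points N and uses an analytic activation such as tanh, the hidden-activation matrix is generically of full rank N, so the output weights can be solved by least squares to fit the training targets exactly. All names and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, m = 20, 5, 30              # training points, input dim, hidden width with m > N
X = rng.standard_normal((N, d))  # random training inputs
y = rng.standard_normal((N, 1))  # random training targets

W1 = rng.standard_normal((d, m))  # random first-layer weights
H = np.tanh(X @ W1)               # hidden activations, shape (N, m)

# With m > N and an analytic activation, H generically has full rank N,
# so least squares finds output weights with (near-)zero squared loss.
W2, *_ = np.linalg.lstsq(H, y, rcond=None)
loss = np.mean((H @ W2 - y) ** 2)
print(np.linalg.matrix_rank(H), loss)  # rank equals N, loss is near zero
```

This only exhibits one zero-loss global minimum; the paper's stronger claim is about the structure of (almost all) local minima under the additional pyramidal condition on the layers above.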
Pages: 10
Related Papers
50 records
  • [1] Overall Loss for Deep Neural Networks
    Huang, Hai
    Cheng, Senlin
    Xu, Liutong
    TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING: PAKDD 2019 WORKSHOPS, 2019, 11607 : 223 - 231
  • [2] Propagation Mechanism for Deep and Wide Neural Networks
    Xu, Dejiang
    Lee, Mong Li
    Hsu, Wynne
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9212 - 9220
  • [3] Deep and Wide Neural Networks Covariance Estimation
    Arratia, Argimiro
    Cabana, Alejandra
    Rafael Leon, Jose
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT I, 2020, 12396 : 195 - 206
  • [4] Classification with Deep Neural Networks and Logistic Loss
    Zhang, Zihan
    Shi, Lei
    Zhou, Ding-Xuan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [5] Wide Neural Networks with Bottlenecks are Deep Gaussian Processes
    Agrawal, Devanshu
    Papamarkou, Theodore
    Hinkle, Jacob
    JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [6] Wide and deep neural networks achieve consistency for classification
    Radhakrishnan, Adityanarayanan
    Belkin, Mikhail
    Uhler, Caroline
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2023, 120 (14)
  • [7] Stable behaviour of infinitely wide deep neural networks
    Favaro, Stefano
    Fortini, Sandra
    Peluchetti, Stefano
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1137 - 1145
  • [9] Loss surface of XOR artificial neural networks
    Mehta, Dhagash
    Zhao, Xiaojun
    Bernal, Edgar A.
    Wales, David J.
    PHYSICAL REVIEW E, 2018, 97 (05)
  • [10] Embedding Principle of Loss Landscape of Deep Neural Networks
    Zhang, Yaoyu
    Zhang, Zhongwang
    Luo, Tao
    Xu, Zhi-Qin John
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34