Comparing dynamics: deep neural networks versus glassy systems

Cited by: 23
Authors
Baity-Jesi, Marco [1, 2]
Sagun, Levent [3, 4]
Geiger, Mario [4]
Spigler, Stefano [3, 4]
Ben Arous, Gerard [5]
Cammarota, Chiara [6]
LeCun, Yann [5, 7, 8]
Wyart, Matthieu [4]
Biroli, Giulio [3, 9]
Affiliations
[1] Eawag, Dept Syst Anal Integrated Assessment & Modelling, Swiss Fed Inst Aquat Sci & Technol, CH-8600 Dubendorf, Switzerland
[2] Columbia Univ, Dept Chem, New York, NY 10027 USA
[3] Univ Paris Saclay, Inst Phys Theor, CEA, CNRS, F-91191 Gif Sur Yvette, France
[4] Ecole Polytech Fed Lausanne, Lausanne, Switzerland
[5] NYU, Courant Inst Math Sci, New York, NY USA
[6] Kings Coll London, Dept Math, London WC2R 2LS, England
[7] NYU, Ctr Data Sci, New York, NY USA
[8] Facebook Inc, Facebook AI Res, New York, NY USA
[9] Sorbonne Univ, PSL Res Univ, CNRS, Lab Phys Stat, Ecole Normale Super, F-75005 Paris, France
Funding
Swiss National Science Foundation;
Keywords
machine learning;
DOI
10.1088/1742-5468/ab3281
Chinese Library Classification
O3 [Mechanics];
Subject classification code
08; 0801;
Abstract
We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems. The two main issues we address are (1) the complexity of the loss landscape and of the dynamics within it, and (2) to what extent DNNs share similarities with glassy systems. Our findings, obtained for different architectures and datasets, suggest that during the training process the dynamics slows down because of an increasingly large number of flat directions. At large times, when the loss is approaching zero, the system diffuses at the bottom of the landscape. Despite some similarities with the dynamics of mean-field glassy systems, in particular, the absence of barrier crossing, we find distinctive dynamical behaviors in the two cases, showing that the statistical properties of the corresponding loss and energy landscapes are different. In contrast, when the network is under-parametrized we observe a typical glassy behavior, thus suggesting the existence of different phases depending on whether the network is under-parametrized or over-parametrized.
Pages: 15