On Non-local Convergence Analysis of Deep Linear Networks

Cited by: 0
Authors
Chen, Kun [1]
Lin, Dachao [2]
Zhang, Zhihua [1]
Affiliations
[1] Peking University, School of Mathematical Sciences, Beijing, People's Republic of China
[2] Peking University, Academy for Advanced Interdisciplinary Studies, Beijing, People's Republic of China
Keywords
PRINCIPAL COMPONENTS; NEURAL-NETWORKS
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study the non-local convergence properties of deep linear networks. Specifically, under the quadratic loss, we consider optimizing deep linear networks in which at least one layer has only a single neuron. We describe the convergent point of trajectories from an arbitrary balanced starting point under gradient flow, including the paths that converge to one of the saddle points. We also show specific convergence rates of trajectories that converge to the global minimizers by stages, and conclude that the rates vary from polynomial to linear. As far as we know, our results are the first to give a non-local analysis of deep linear neural networks with arbitrary balanced initialization, rather than the lazy training regime, which has dominated the literature on neural networks, or the restricted benign initialization.
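For orientation, the setting named in the abstract can be written out as follows. This is a sketch in the notation commonly used for deep linear networks, not the paper's own formulation: the symbols $N$, $W_j$, $d_j$, $X$, $Y$, and $k$ are assumptions introduced here, and the paper may normalize the data or the loss differently.

```latex
% Assumed notation (not taken from the record): N layers with weights
% W_j \in \mathbb{R}^{d_j \times d_{j-1}}, inputs X and targets Y.
\begin{align*}
  % Quadratic loss of the end-to-end linear map W_N \cdots W_1
  L(W_1, \dots, W_N) &= \tfrac{1}{2}\,\bigl\| W_N W_{N-1} \cdots W_1 X - Y \bigr\|_F^2, \\
  % Gradient flow: continuous-time limit of gradient descent
  \frac{\mathrm{d} W_j(t)}{\mathrm{d} t} &= -\frac{\partial L}{\partial W_j}\bigl(W_1(t), \dots, W_N(t)\bigr),
    \qquad j = 1, \dots, N, \\
  % Balanced initialization (standard definition in this literature)
  W_{j+1}(0)^{\top} W_{j+1}(0) &= W_j(0)\, W_j(0)^{\top},
    \qquad j = 1, \dots, N-1, \\
  % A layer with a single neuron (d_k = 1 for some k) bounds the rank of the product
  d_k = 1 \;&\Longrightarrow\; \operatorname{rank}\bigl(W_N \cdots W_1\bigr) \le 1 .
\end{align*}
```

Under gradient flow on a loss of this form, the differences $W_{j+1}^{\top} W_{j+1} - W_j W_j^{\top}$ are constant in time, so a trajectory that starts balanced stays balanced. That is a standard property of this setup and is stated here as background rather than as a result of the paper.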
Pages: 27