Exact learning dynamics of deep linear networks with prior knowledge

Cited by: 0
Authors
Braun, Lukas [1 ]
Domine, Clementine C. J. [2 ]
Fitzgerald, James E. [3 ]
Saxe, Andrew M. [2 ,4 ,5 ]
Affiliations
[1] Univ Oxford, Dept Expt Psychol, Oxford, England
[2] UCL, Gatsby Computat Neurosci Unit, London, England
[3] Janelia Res Campus, Howard Hughes Med Inst, Ashburn, VA USA
[4] UCL, Sainsbury Wellcome Ctr, London, England
[5] CIFAR, Toronto, ON, Canada
Funding
Wellcome Trust (UK); UK Medical Research Council;
Keywords
CONNECTIONIST MODELS; NEURAL-NETWORKS; SYSTEMS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Learning in deep neural networks is known to depend critically on the knowledge embedded in the initial network weights. However, few theoretical results have precisely linked prior knowledge to learning dynamics. Here we derive exact solutions to the dynamics of learning with rich prior knowledge in deep linear networks by generalising Fukumizu's matrix Riccati solution [1]. We obtain explicit expressions for the evolving network function, hidden representational similarity, and neural tangent kernel over training for a broad class of initialisations and tasks. The expressions reveal a class of task-independent initialisations that radically alter learning dynamics from slow non-linear dynamics to fast exponential trajectories while converging to a global optimum with identical representational similarity, dissociating learning trajectories from the structure of initial internal representations. We characterise how network weights dynamically align with task structure, rigorously justifying why previous solutions successfully described learning from small initial weights without incorporating their fine-scale structure. Finally, we discuss the implications of these findings for continual learning, reversal learning and learning of structured knowledge. Taken together, our results provide a mathematical toolkit for understanding the impact of prior knowledge on deep learning.
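As a minimal, hypothetical illustration of the setting the abstract describes (not the paper's exact construction), the sketch below trains a two-layer linear network with full-batch gradient descent on a squared loss and compares a small random initialisation, which produces the slow plateaued trajectories, against a larger balanced initialisation (`W2 = W1.T`), standing in for the richer prior-knowledge initialisations whose dynamics the paper solves exactly. All names and parameter values here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Whitened-input setting: the network function reduces to the weight
# product, and gradient descent on 0.5 * ||W2 @ W1 - target||_F^2
# mirrors the deep-linear-network dynamics the paper analyses.
d_in, d_hidden, d_out = 4, 4, 4
U, _, Vt = np.linalg.svd(rng.normal(size=(d_out, d_in)))
target = U @ np.diag([3.0, 2.0, 1.0, 0.5]) @ Vt  # known singular values

def train(W1, W2, lr=0.01, steps=3000):
    """Full-batch gradient descent; returns the loss trajectory."""
    losses = []
    for _ in range(steps):
        err = W2 @ W1 - target
        losses.append(0.5 * np.sum(err ** 2))
        gW1 = W2.T @ err   # dL/dW1
        gW2 = err @ W1.T   # dL/dW2
        W1 -= lr * gW1
        W2 -= lr * gW2
    return np.array(losses)

# Small random initial weights: long plateaus followed by rapid,
# stage-like drops (the classic sigmoidal trajectories).
small = train(0.01 * rng.normal(size=(d_hidden, d_in)),
              0.01 * rng.normal(size=(d_out, d_hidden)))

# Larger balanced initialisation: learning starts immediately and the
# loss decays much more smoothly from the first step.
A = rng.normal(size=(d_hidden, d_in))
large = train(A.copy(), A.T.copy())

print(f"small init: loss {small[0]:.2f} -> {small[-1]:.2e}")
print(f"large init: loss {large[0]:.2f} -> {large[-1]:.2e}")
```

Both runs reach a near-zero loss (deep linear networks converge to a global optimum here), but the shape of the two loss trajectories differs markedly, which is the dissociation between learning dynamics and prior weights that the abstract highlights.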
Pages: 15
Related papers (50 in total)
  • [31] Learning Shared Knowledge for Deep Lifelong Learning Using Deconvolutional Networks
    Lee, Seungwon
    Stokes, James
    Eaton, Eric
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 2837 - 2844
  • [32] Data Repair Without Prior Knowledge Using Deep Convolutional Neural Networks
    Qie, Youtian
    Song, Ping
    Hao, Chuangbo
    IEEE ACCESS, 2020, 8 (08): 105351 - 105361
  • [33] Learning sparse deep neural networks with a spike-and-slab prior
    Sun, Yan
    Song, Qifan
    Liang, Faming
    STATISTICS & PROBABILITY LETTERS, 2022, 180
  • [34] DYNAMICS OF LEARNING IN LINEAR FEATURE-DISCOVERY NETWORKS
    LEEN, TK
    NETWORK-COMPUTATION IN NEURAL SYSTEMS, 1991, 2 (01) : 85 - 105
  • [35] Anomalous diffusion dynamics of learning in deep neural networks
    Chen, Guozhang
    Qu, Cheng Kevin
    Gong, Pulin
    NEURAL NETWORKS, 2022, 149 : 18 - 28
  • [36] Learning Graph Dynamics using Deep Neural Networks
    Narayan, Apurva
    Roe, Peter H. O'N
    IFAC PAPERSONLINE, 2018, 51 (02): 433 - 438
  • [37] Characterizing Learning Dynamics of Deep Neural Networks via Complex Networks
    La Malfa, Emanuele
    La Malfa, Gabriele
    Nicosia, Giuseppe
    Latora, Vito
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 344 - 351
  • [38] Asymptotic Expansion as Prior Knowledge in Deep Learning Method for High dimensional BSDEs
    Fujii, Masaaki
    Takahashi, Akihiko
    Takahashi, Masayuki
    ASIA-PACIFIC FINANCIAL MARKETS, 2019, 26 (03) : 391 - 408
  • [39] Deep Reinforcement Learning for Time Optimal Velocity Control using Prior Knowledge
    Hartmann, Gabriel
    Shiller, Zvi
    Azaria, Amos
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 186 - 193
  • [40] Prior Knowledge-Guided Deep Learning Algorithms for Metantenna Design (Invited)
    Liu, Peiqin
    Chen, Zhi Ning
    2024 IEEE INTERNATIONAL WORKSHOP ON ANTENNA TECHNOLOGY, IWAT, 2024, : 11 - 13