Complexity control by gradient descent in deep networks

Cited by: 0
Authors
Tomaso Poggio
Qianli Liao
Andrzej Banburski
Affiliation
[1] MIT, Center for Brains, Minds, and Machines
DOI: not available
Abstract
Overparametrized deep networks predict well despite the absence of explicit complexity control during training, such as a regularization term. For exponential-type loss functions, we resolve this puzzle by showing an effective regularization effect of gradient descent in terms of the normalized weights that are relevant for classification.
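The abstract's claim can be illustrated in miniature (this is a toy sketch, not the paper's experiments): run plain gradient descent on an exponential loss over linearly separable data. The weight norm keeps growing without bound, but the normalized weights w/||w||, which are all that matter for the classification decision, converge to a fixed direction. All data and hyperparameters below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated Gaussian blobs, labels in {-1, +1}
X = np.vstack([0.5 * rng.normal(size=(100, 2)) + 3.0,
               0.5 * rng.normal(size=(100, 2)) - 3.0])
y = np.concatenate([np.ones(100), -np.ones(100)])

w = 0.01 * rng.normal(size=2)   # tiny random initialization
lr = 0.1
directions = []
for t in range(5000):
    margins = y * (X @ w)
    # exponential loss L(w) = mean(exp(-y * x.w)); its gradient in w:
    grad = -(np.exp(-margins)[:, None] * y[:, None] * X).mean(axis=0)
    w -= lr * grad
    directions.append(w / np.linalg.norm(w))

print("final |w|:", np.linalg.norm(w))   # grows (slowly) throughout training
print("direction drift over last 500 steps:",
      np.linalg.norm(directions[-1] - directions[-501]))  # near zero
```

Because the exponential loss has no finite minimizer on separable data, ||w|| diverges logarithmically; the stabilizing quantity is the direction of w, which is the sense in which gradient descent provides an implicit complexity control.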
Related papers (50 total)
  • [31] Gradient Descent Finds Global Minima for Generalizable Deep Neural Networks of Practical Sizes
    Kawaguchi, Kenji
    Huang, Jiaoyang
    [J]. 2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 92 - 99
  • [33] Convergence of stochastic gradient descent under a local Łojasiewicz condition for deep neural networks
    An, Jing
    Lu, Jianfeng
    [J]. arXiv, 2023
  • [34] On the complexity of backpropagation with momentum and gradient descent on sigmoidal steepness
    Leung, Wing Kai
    [J]. Advances in Neural Networks and Applications, 2001, : 391 - 396
  • [35] Impact of Mathematical Norms on Convergence of Gradient Descent Algorithms for Deep Neural Networks Learning
    Cai, Linzhe
    Yu, Xinghuo
    Li, Chaojie
    Eberhard, Andrew
    Lien Thuy Nguyen
    Chuong Thai Doan
    [J]. AI 2022: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, 13728 : 131 - 144
  • [36] Vehicle networks for gradient descent in a sampled environment
    Bachmayer, R
    Leonard, NE
    [J]. PROCEEDINGS OF THE 41ST IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4, 2002, : 112 - 117
  • [37] Localization in Wireless Sensor Networks with Gradient Descent
    Qiao, Dapeng
    Pang, Grantham K. H.
    [J]. 2011 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS AND SIGNAL PROCESSING (PACRIM), 2011, : 91 - 96
  • [38] Applying Gradient Descent in Convolutional Neural Networks
    Cui, Nan
    [J]. 2ND INTERNATIONAL CONFERENCE ON MACHINE VISION AND INFORMATION TECHNOLOGY (CMVIT 2018), 2018, 1004
  • [39] Exponential Convergence Time of Gradient Descent for One-Dimensional Deep Linear Neural Networks
    Shamir, Ohad
    [J]. CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [40] Fractional-order stochastic gradient descent method with momentum and energy for deep neural networks
    Zhou, Xingwen
    You, Zhenghao
    Sun, Weiguo
    Zhao, Dongdong
    Yan, Shi
    [J]. Neural Networks, 2025, 181