Efficient Dictionary Learning with Gradient Descent

Cited: 0
Authors
Gilboa, Dar [1 ,2 ]
Buchanan, Sam [2 ,3 ]
Wright, John [2 ,3 ]
Affiliations
[1] Columbia Univ, Dept Neurosci, New York, NY 10027 USA
[2] Columbia Univ, Data Sci Inst, New York, NY 10027 USA
[3] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
Funding
National Science Foundation (USA)
Keywords
SPARSE; RECONSTRUCTION;
DOI
None
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of poor objective value. For some highly structured nonconvex problems, however, the success of gradient descent can be understood by studying the geometry of the objective. We study one such problem, complete orthogonal dictionary learning, and provide convergence guarantees for randomly initialized gradient descent to the neighborhood of a global optimum. The resulting rates scale as low-order polynomials in the dimension even though the objective possesses an exponential number of saddle points. This efficient convergence can be viewed as a consequence of negative curvature normal to the stable manifolds associated with saddle points, and we provide evidence that this feature is shared by other nonconvex problems of importance as well.
Pages: 8
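The abstract refers to complete orthogonal dictionary learning solved by randomly initialized gradient descent over the sphere. The sketch below is not the authors' code; it illustrates one standard formulation of that setup: given Y = A0 X0 with A0 orthogonal and X0 sparse, a single column of A0 is recovered by minimizing the smooth sparsity surrogate f(q) = mean_i log cosh(q^T y_i) over the unit sphere with Riemannian gradient descent from a random initialization. The dimensions, sample size, sparsity level, step size, and iteration count are illustrative assumptions.

```python
# Minimal sketch (assumed formulation, not the paper's implementation):
# recover one column of an orthogonal dictionary A0 from Y = A0 @ X0 with
# sparse X0, via randomly initialized Riemannian gradient descent on the
# sphere for f(q) = mean(log cosh(q^T y_i)).
import numpy as np

rng = np.random.default_rng(0)
n, m, theta = 30, 20000, 0.2          # dimension, samples, sparsity level (illustrative)

# Ground truth: orthogonal dictionary A0 and Bernoulli-Gaussian sparse codes X0.
A0, _ = np.linalg.qr(rng.standard_normal((n, n)))
X0 = rng.standard_normal((n, m)) * (rng.random((n, m)) < theta)
Y = A0 @ X0

def grad_f(q, Y):
    """Euclidean gradient of f(q) = mean_i log cosh(q^T y_i)."""
    return Y @ np.tanh(Y.T @ q) / Y.shape[1]

# Randomly initialized gradient descent, retracted to the sphere after each step.
q = rng.standard_normal(n)
q /= np.linalg.norm(q)
step = 0.1
for _ in range(500):
    g = grad_f(q, Y)
    g_riem = g - (q @ g) * q          # project gradient onto the tangent space
    q = q - step * g_riem
    q /= np.linalg.norm(q)            # retract back to the unit sphere

# If descent succeeds, q aligns with one column of A0 (up to sign),
# so this value should be close to 1.
print("max |<q, a_j>| =", np.max(np.abs(A0.T @ q)))
```

Repeating the descent from fresh random initializations (or deflating recovered directions) would, under this formulation, recover the remaining columns of the dictionary one at a time.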