Escape saddle points by a simple gradient-descent based algorithm

Cited by: 0
Authors
Zhang, Chenyi [1 ]
Li, Tongyang [2 ,3 ,4 ]
Affiliations
[1] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[2] Peking Univ, Ctr Frontiers Comp Studies, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
[4] MIT, Ctr Theoret Phys, Cambridge, MA USA
Keywords
CUBIC REGULARIZATION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function f : ℝ^n → ℝ, it outputs an ε-approximate second-order stationary point in Õ(log n / ε^1.75) iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with Õ(log^4 n / ε^2) or Õ(log^6 n / ε^1.75) iterations, our algorithm is polynomially better in terms of log n and matches their complexities in terms of 1/ε. For the stochastic setting, our algorithm outputs an ε-approximate second-order stationary point in Õ(log^2 n / ε^4) iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve a polynomial speedup in log n compared to perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
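Note: the core idea described in the abstract, a Hessian power method implemented with gradients only, can be illustrated by a minimal sketch. The snippet below is an assumption-laden simplification, not the authors' algorithm: it approximates Hessian-vector products by finite differences of gradients and runs power iteration on the shifted operator (L·I − H) to extract a negative-curvature direction near a saddle point, omitting the paper's robustness and perturbation analysis. Names such as negative_curvature_direction and all parameter choices are hypothetical.

import numpy as np

def hvp(grad_f, x, v, r=1e-5):
    # Approximate the Hessian-vector product H(x) v using only gradients:
    # H(x) v ≈ (∇f(x + r v) − ∇f(x)) / r   (forward finite difference)
    return (grad_f(x + r * v) - grad_f(x)) / r

def negative_curvature_direction(grad_f, x, dim, L, iters=100, r=1e-5, seed=0):
    # Power iteration on the shifted operator (L*I − H(x)): its leading
    # eigenvector is the eigenvector of H(x) with the most negative eigenvalue,
    # i.e. a negative-curvature (escape) direction when x is near a saddle.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = L * v - hvp(grad_f, x, v, r)   # apply (L*I − H) using gradients only
        v /= np.linalg.norm(v) + 1e-12
    curvature = v @ hvp(grad_f, x, v, r)   # Rayleigh quotient v^T H(x) v
    return v, curvature

# Toy example: f(x, y) = x^2 − y^2 has a saddle point at the origin.
grad_f = lambda z: np.array([2.0 * z[0], -2.0 * z[1]])
v, curv = negative_curvature_direction(grad_f, np.array([1e-3, 1e-3]), dim=2, L=2.0)
print(v, curv)   # v ≈ (0, ±1), curv ≈ −2: moving along ±v escapes the saddle

On this toy quadratic the finite-difference Hessian-vector product is exact, and the iteration converges to the direction of most negative curvature; the paper's contribution is making this kind of gradient-only power iteration robust enough to yield the stated iteration complexities.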
Pages: 12
Related Papers
50 records in total
  • [1] Gradient Descent Can Take Exponential Time to Escape Saddle Points
    Du, Simon S.
    Jin, Chi
    Lee, Jason D.
    Jordan, Michael I.
    Poczos, Barnabas
    Singh, Aarti
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
    Li, Zhize
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] Quantized Gradient-Descent Algorithm for Distributed Resource Allocation
    Zhou, Hongbing
    Yu, Weiyong
    Yi, Peng
    Hong, Yiguang
    [J]. UNMANNED SYSTEMS, 2019, 7 (02) : 119 - 136
  • [4] Gradient-Descent Algorithm Performance With Reduced Set of Quantized Measurements
    Stankovic, Isidora
    Brajovic, Milos
    Dakovic, Milos
    Ioana, Cornel
    [J]. 2019 8TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2019, : 489 - 492
  • [5] Accelerating the Iteratively Preconditioned Gradient-Descent Algorithm using Momentum
    Liu, Tianchen
    Chakrabarti, Kushal
    Chopra, Nikhil
    [J]. 2023 NINTH INDIAN CONTROL CONFERENCE, ICC, 2023, : 68 - 73
  • [6] Revisiting Normalized Gradient Descent: Fast Evasion of Saddle Points
    Murray, Ryan
    Swenson, Brian
    Kar, Soummya
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (11) : 4818 - 4824
  • [7] Gradient-Descent Training for Phase-Based Neurons
    Pavaloiu, Ionel Bujorel
    Dragoi, George
    Vasile, Adrian
    [J]. 2014 18TH INTERNATIONAL CONFERENCE SYSTEM THEORY, CONTROL AND COMPUTING (ICSTCC), 2014, : 874 - 878
  • [8] Reconstruction of Global Ozone Density Data using a Gradient-Descent Algorithm
    Stankovic, Isidora
    Dai, Wei
    [J]. PROCEEDINGS OF ELMAR 2016 - 58TH INTERNATIONAL SYMPOSIUM ELMAR 2016, 2016, : 85 - 88
  • [9] Recurrent neural tracking control based on multivariable robust adaptive gradient-descent training algorithm
    Xu, Zhao
    Song, Qing
    Wang, Danwei
    [J]. NEURAL COMPUTING & APPLICATIONS, 2012, 21 (07): : 1745 - 1755
  • [10] Development of Amari Alpha Divergence-Based Gradient-Descent Least Mean Square Algorithm
    Sharma, Parth
    Pradhan, Pyari Mohan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70 (08) : 3194 - 3198