Escape saddle points by a simple gradient-descent based algorithm

Cited by: 0
Authors
Zhang, Chenyi [1 ]
Li, Tongyang [2 ,3 ,4 ]
Affiliations
[1] Tsinghua Univ, Inst Interdisciplinary Informat Sci, Beijing, Peoples R China
[2] Peking Univ, Ctr Frontiers Comp Studies, Beijing, Peoples R China
[3] Peking Univ, Sch Comp Sci, Beijing, Peoples R China
[4] MIT, Ctr Theoret Phys, Cambridge, MA USA
Keywords
CUBIC REGULARIZATION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function f : ℝ^n → ℝ, it outputs an ε-approximate second-order stationary point in Õ(log n / ε^1.75) iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with Õ(log^4 n / ε^2) or Õ(log^6 n / ε^1.75) iterations, our algorithm is polynomially better in terms of log n and matches their complexities in terms of 1/ε. For the stochastic setting, our algorithm outputs an ε-approximate second-order stationary point in Õ(log^2 n / ε^4) iterations. Technically, our main contribution is an idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieve a polynomial speedup in log n compared to perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
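Note: the core idea described in the abstract, a Hessian power method implemented with gradients only, can be illustrated by a minimal sketch. The snippet below is an assumption-laden simplification, not the authors' algorithm: it approximates Hessian-vector products by finite differences of gradients and runs power iteration on the shifted operator (L·I − H) to extract a negative-curvature direction near a saddle point, omitting the paper's robustness and perturbation analysis. Names such as negative_curvature_direction and all parameter choices are hypothetical.

import numpy as np

def hvp(grad_f, x, v, r=1e-5):
    # Approximate the Hessian-vector product H(x) v using only gradients:
    # H(x) v ≈ (∇f(x + r v) − ∇f(x)) / r   (forward finite difference)
    return (grad_f(x + r * v) - grad_f(x)) / r

def negative_curvature_direction(grad_f, x, dim, L, iters=100, r=1e-5, seed=0):
    # Power iteration on the shifted operator (L*I − H(x)): its leading
    # eigenvector is the eigenvector of H(x) with the most negative eigenvalue,
    # i.e. a negative-curvature (escape) direction when x is near a saddle.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = L * v - hvp(grad_f, x, v, r)   # apply (L*I − H) using gradients only
        v /= np.linalg.norm(v) + 1e-12
    curvature = v @ hvp(grad_f, x, v, r)   # Rayleigh quotient v^T H(x) v
    return v, curvature

# Toy example: f(x, y) = x^2 − y^2 has a saddle point at the origin.
grad_f = lambda z: np.array([2.0 * z[0], -2.0 * z[1]])
v, curv = negative_curvature_direction(grad_f, np.array([1e-3, 1e-3]), dim=2, L=2.0)
print(v, curv)   # v ≈ (0, ±1), curv ≈ −2: moving along ±v escapes the saddle

On this toy quadratic the finite-difference Hessian-vector product is exact, and the iteration converges to the direction of most negative curvature; the paper's contribution is making this kind of gradient-only power iteration robust enough to yield the stated iteration complexities.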
Pages: 12
Related Papers
50 records in total
  • [1] Gradient Descent Can Take Exponential Time to Escape Saddle Points
    Du, Simon S.
    Jin, Chi
    Lee, Jason D.
    Jordan, Michael I.
    Poczos, Barnabas
    Singh, Aarti
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [2] SSRGD: Simple Stochastic Recursive Gradient Descent for Escaping Saddle Points
    Li, Zhize
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [3] Quantized Gradient-Descent Algorithm for Distributed Resource Allocation
    Zhou, Hongbing
    Yu, Weiyong
    Yi, Peng
    Hong, Yiguang
    [J]. UNMANNED SYSTEMS, 2019, 7 (02) : 119 - 136
  • [4] Gradient-Descent Algorithm Performance With Reduced Set of Quantized Measurements
    Stankovic, Isidora
    Brajovic, Milos
    Dakovic, Milos
    Ioana, Cornel
    [J]. 2019 8TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2019, : 489 - 492
  • [5] Accelerating the Iteratively Preconditioned Gradient-Descent Algorithm using Momentum
    Liu, Tianchen
    Chakrabarti, Kushal
    Chopra, Nikhil
    [J]. 2023 NINTH INDIAN CONTROL CONFERENCE, ICC, 2023, : 68 - 73
  • [6] Revisiting Normalized Gradient Descent: Fast Evasion of Saddle Points
    Murray, Ryan
    Swenson, Brian
    Kar, Soummya
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (11) : 4818 - 4824
  • [7] Gradient-Descent Training for Phase-Based Neurons
    Pavaloiu, Ionel Bujorel
    Dragoi, George
    Vasile, Adrian
    [J]. 2014 18TH INTERNATIONAL CONFERENCE SYSTEM THEORY, CONTROL AND COMPUTING (ICSTCC), 2014, : 874 - 878
  • [8] Reconstruction of Global Ozone Density Data using a Gradient-Descent Algorithm
    Stankovic, Isidora
    Dai, Wei
    [J]. PROCEEDINGS OF ELMAR 2016 - 58TH INTERNATIONAL SYMPOSIUM ELMAR 2016, 2016, : 85 - 88
  • [9] Recurrent neural tracking control based on multivariable robust adaptive gradient-descent training algorithm
    Xu, Zhao
    Song, Qing
    Wang, Danwei
    [J]. NEURAL COMPUTING & APPLICATIONS, 2012, 21 (07): : 1745 - 1755
  • [10] Development of Amari Alpha Divergence-Based Gradient-Descent Least Mean Square Algorithm
    Sharma, Parth
    Pradhan, Pyari Mohan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2023, 70 (08) : 3194 - 3198