Finding Second-Order Stationary Points in Nonconvex-Strongly-Concave Minimax Optimization

Cited by: 0
Authors
Luo, Luo [1 ]
Li, Yujun [2 ]
Chen, Cheng [3 ]
Affiliations
[1] Fudan Univ, Sch Data Sci, Shanghai, Peoples R China
[2] Huawei Technol Co Ltd, Noahs Ark Lab, Shenzhen, Peoples R China
[3] Nanyang Technol Univ, Sch Phys & Math Sci, Singapore, Singapore
Funding
National Natural Science Foundation of China;
Keywords
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We study the smooth minimax optimization problem min_x max_y f(x, y), where f is ℓ-smooth and strongly concave in y but possibly nonconvex in x. Most existing works focus on finding first-order stationary points of the function f(x, y) or of its primal function P(x) ≜ max_y f(x, y); few address second-order stationary points. In this paper, we propose a novel approach for minimax optimization, called Minimax Cubic Newton (MCN), which finds an (ε, κ^{1.5}√(ρε))-second-order stationary point of P(x) using O(κ^{1.5}√ρ ε^{-1.5}) second-order oracle calls and Õ(κ^2 √ρ ε^{-1.5}) first-order oracle calls, where κ is the condition number and ρ is the Lipschitz constant of the Hessian of f(x, y). In addition, we propose an inexact variant of MCN for high-dimensional problems that avoids calling expensive second-order oracles; instead, it solves the cubic sub-problem inexactly via gradient descent and matrix Chebyshev expansion. This strategy still attains the desired approximate second-order stationary point with high probability, while requiring only Õ(κ^{1.5} ℓ ε^{-2}) Hessian-vector oracle calls and Õ(κ^2 √ρ ε^{-1.5}) first-order oracle calls. To the best of our knowledge, this is the first work to study the non-asymptotic convergence of finding second-order stationary points for minimax problems without convex-concave assumptions.
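The overall scheme described in the abstract — run the inner maximization over y, then take a cubic-regularized Newton step on the primal function P(x), solving the cubic sub-problem inexactly by gradient descent — can be sketched on a toy instance. Everything below is an illustrative assumption, not the authors' code: the objective f(x, y) = φ(x) + xᵀAy − ½‖y‖² (nonconvex in x, 1-strongly concave in y), the parameter values (rho, step sizes, iteration counts), and all function names are chosen for a small runnable example; the Chebyshev-expansion machinery of the inexact variant is replaced by plain gradient descent on the sub-problem.

```python
import numpy as np

# Toy instance: f(x, y) = phi(x) + x'Ay - 0.5*||y||^2, with the nonconvex
# phi(x) = 0.25 * sum((x_i^2 - 1)^2).  f is 1-strongly concave in y, and the
# primal function is P(x) = max_y f(x, y) = phi(x) + 0.5*||A'x||^2.
rng = np.random.default_rng(0)
d = 5
A = 0.1 * rng.standard_normal((d, d))

def grad_y(x, y):
    # Gradient of f in y; the inner maximizer is y*(x) = A'x.
    return A.T @ x - y

def grad_P(x, y):
    # By Danskin's theorem, grad_x f(x, y*(x)) = grad P(x); here y ~ y*(x).
    return x * (x**2 - 1) + A @ y

def hess_P(x):
    # Hessian of P for this toy problem: diag(3*x_i^2 - 1) + A A'.
    return np.diag(3 * x**2 - 1) + A @ A.T

def solve_cubic_subproblem(g, H, rho, iters=500, lr=0.01):
    """Inexactly minimize g's + 0.5*s'Hs + (rho/6)*||s||^3 by gradient
    descent, echoing the inexact solver described in the abstract."""
    s = np.zeros_like(g)
    for _ in range(iters):
        s -= lr * (g + H @ s + 0.5 * rho * np.linalg.norm(s) * s)
    return s

def minimax_cubic_newton(x, rho=10.0, outer=50, inner=100, lr_y=0.5):
    y = np.zeros_like(x)
    for _ in range(outer):
        for _ in range(inner):                 # inner maximization over y
            y += lr_y * grad_y(x, y)
        s = solve_cubic_subproblem(grad_P(x, y), hess_P(x), rho)
        x = x + s                              # cubic-regularized Newton step
    return x, y

x, y = minimax_cubic_newton(rng.standard_normal(d))
print(np.linalg.norm(grad_P(x, y)))            # small gradient norm of P
print(np.linalg.eigvalsh(hess_P(x)).min())     # positive minimum eigenvalue
```

A small gradient norm together with a non-negative minimum Hessian eigenvalue is exactly the approximate second-order stationarity the paper targets; with a plain gradient-descent-ascent baseline the iterates could instead stall at a saddle point of P.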
Pages: 13
Related Papers
50 records in total
  • [1] An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization
    Chen, Lesi
    Ye, Haishan
    Luo, Luo
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [2] Zeroth-order algorithms for nonconvex-strongly-concave minimax problems with improved complexities
    Wang, Zhongruo
    Balasubramanian, Krishnakumar
    Ma, Shiqian
    Razaviyayn, Meisam
    JOURNAL OF GLOBAL OPTIMIZATION, 2023, 87 (2-4) : 709 - 740
  • [3] Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems
    Luo, Luo
    Ye, Haishan
    Huang, Zhichao
    Zhang, Tong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [4] Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization
    Li, Haochuan
    Tian, Yi
    Zhang, Jingzhao
    Jadbabaie, Ali
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [5] Generalization Bounds of Nonconvex-(Strongly)-Concave Stochastic Minimax Optimization
    Zhang, Siqi
    Hu, Yifan
    Zhang, Liang
    He, Niao
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [6] Sample Complexity of Policy Gradient Finding Second-Order Stationary Points
    Yang, Long
    Zheng, Qian
    Pan, Gang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 10630 - 10638
  • [7] Second-Order Online Nonconvex Optimization
    Lesage-Landry, Antoine
    Taylor, Joshua A.
    Shames, Iman
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (10) : 4866 - 4872
  • [8] Finding Second-Order Stationary Points in Constrained Minimization: A Feasible Direction Approach
    Hallak, Nadav
    Teboulle, Marc
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2020, 186 (02) : 480 - 503
  • [9] PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization
    Lu, Songtao
    Hong, Mingyi
    Wang, Zhengdao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97