Gradient-free methods for non-smooth convex stochastic optimization with heavy-tailed noise on convex compact

Cited by: 0
Authors
Nikita Kornilov
Alexander Gasnikov
Pavel Dvurechensky
Darina Dvinskikh
Institutions
[1] Moscow Institute of Physics and Technology
[2] Weierstrass Institute for Applied Analysis and Stochastics
[3] HSE University
[4] Skoltech
[5] ISP RAS Research Center for Trusted Artificial Intelligence
Source
Computational Management Science | 2023, Vol. 20
Keywords
Zeroth-order optimization; Derivative-free optimization; Stochastic optimization; Non-smooth problems; Heavy tails; Gradient clipping; Stochastic mirror descent;
DOI
Not available
Abstract
We present two easy-to-implement gradient-free/zeroth-order methods for optimizing a stochastic non-smooth function accessible only through a black-box. The methods build upon efficient first-order methods for the heavy-tailed case, i.e., when the gradient noise has infinite variance but a bounded $(1+\kappa)$-th moment for some $\kappa \in (0,1]$. The first algorithm is based on stochastic mirror descent with a particular class of uniformly convex mirror maps that is robust to heavy-tailed noise. The second algorithm is based on stochastic mirror descent combined with the gradient clipping technique. Additionally, for objective functions satisfying the $r$-growth condition, faster algorithms are proposed by combining these methods with the restart technique.
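To illustrate the general approach outlined in the abstract (this is a minimal sketch, not the authors' exact algorithm), the code below combines a two-point zeroth-order gradient estimator with gradient clipping and projected (Euclidean mirror) descent steps over a ball. All function names, step sizes, smoothing and clipping parameters are hypothetical placeholders chosen for the example.

```python
import numpy as np

def clipped_zo_mirror_descent(f, x0, radius, n_steps, step_size, tau, clip_level, seed=None):
    """Hedged sketch: two-point zeroth-order estimator + gradient clipping +
    projected (Euclidean mirror) descent on the ball {||x|| <= radius}.
    `f(x)` returns a noisy function value; all parameters are illustrative."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    avg = np.zeros_like(x)
    for _ in range(n_steps):
        # random direction on the unit sphere
        e = rng.standard_normal(x.shape)
        e /= np.linalg.norm(e)
        # two-point stochastic finite-difference gradient estimate
        g = (x.size / (2.0 * tau)) * (f(x + tau * e) - f(x - tau * e)) * e
        # clip the estimate to tame heavy-tailed noise
        norm_g = np.linalg.norm(g)
        if norm_g > clip_level:
            g *= clip_level / norm_g
        # projected descent step back onto the ball
        x = x - step_size * g
        norm_x = np.linalg.norm(x)
        if norm_x > radius:
            x *= radius / norm_x
        avg += x
    return avg / n_steps  # averaged iterate

# toy usage: noisy ||x - 1||_1 objective with heavy-tailed (Student-t) noise
if __name__ == "__main__":
    noise = np.random.default_rng(0)
    f = lambda x: np.abs(x - 1.0).sum() + noise.standard_t(df=2)
    x_hat = clipped_zo_mirror_descent(f, x0=np.zeros(5), radius=10.0,
                                      n_steps=20000, step_size=0.01,
                                      tau=1e-3, clip_level=50.0, seed=1)
    print(x_hat)
```

The Student-t noise with 2 degrees of freedom has infinite variance but a bounded $(1+\kappa)$-th moment for $\kappa < 1$, matching the noise regime the paper considers; the clipping step is what keeps the iterates stable under such noise in this sketch.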
Related Papers
50 records in total
  • [41] Gradient Methods for Non-convex Optimization
    Prateek Jain
    Journal of the Indian Institute of Science, 2019, 99 : 247 - 256
  • [42] KKT OPTIMALITY CONDITIONS IN NON-SMOOTH, NON-CONVEX OPTIMIZATION
    Sisarat, Nithirat
    Wangkeeree, Rabian
    Lee, Gue Myung
    JOURNAL OF NONLINEAR AND CONVEX ANALYSIS, 2018, 19 (08) : 1319 - 1329
  • [43] Adaptive Stochastic Gradient Descent Method for Convex and Non-Convex Optimization
    Chen, Ruijuan
    Tang, Xiaoquan
    Li, Xiuting
    FRACTAL AND FRACTIONAL, 2022, 6 (12)
  • [44] Discussion on: "An adaptive gradient law with projection for non-smooth convex boundaries"
    Back, Juhoon
    Shim, Hyungbo
    EUROPEAN JOURNAL OF CONTROL, 2006, 12 (06) : 620 - 621
  • [45] Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
    Tan, Yan Shuo
    Vershynin, Roman
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [46] On variance reduction for stochastic smooth convex optimization with multiplicative noise
    Alejandro Jofré
    Philip Thompson
    Mathematical Programming, 2019, 174 : 253 - 292
  • [47] On variance reduction for stochastic smooth convex optimization with multiplicative noise
    Jofre, Alejandro
    Thompson, Philip
    MATHEMATICAL PROGRAMMING, 2019, 174 (1-2) : 253 - 292
  • [48] Stochastic Optimization for DC Functions and Non-smooth Non-convex Regularizers with Non-asymptotic Convergence
    Xu, Yi
    Qi, Qi
    Lin, Qihang
    Jin, Rong
    Yang, Tianbao
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [49] A gradient-free distributed optimization method for convex sum of nonconvex cost functions
    Pang, Yipeng
    Hu, Guoqiang
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (14) : 8086 - 8101
  • [50] A method to construct a quasi-normal cone for non-convex and non-smooth set and its applications to non-convex and non-smooth optimization
    Li, Hongwei
    Zhou, Dequn
    Liu, Qinghuai
    WCICA 2006: SIXTH WORLD CONGRESS ON INTELLIGENT CONTROL AND AUTOMATION, VOLS 1-12, CONFERENCE PROCEEDINGS, 2006, : 1585 - +