Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization

Cited by: 0
Authors
Amir, Idan [1]
Livni, Roi [1]
Srebro, Nathan [2]
Affiliations
[1] Tel Aviv University, Tel Aviv, Israel
[2] Toyota Technological Institute at Chicago, Chicago, IL, USA
DOI
Not available
Chinese Library Classification
TP18 (Artificial Intelligence Theory)
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405
Abstract
We consider linear prediction with a convex Lipschitz loss, or more generally, stochastic convex optimization problems of generalized linear form, i.e. where each instantaneous loss is a scalar convex function of a linear function. We show that in this setting, early-stopped Gradient Descent (GD), without any explicit regularization or projection, ensures excess error at most $\epsilon$ (compared to the best possible predictor of unit Euclidean norm) with an optimal, up to logarithmic factors, sample complexity of $\tilde{O}(1/\epsilon^2)$ and only $\tilde{O}(1/\epsilon^2)$ iterations. This contrasts with general stochastic convex optimization, where $\tilde{O}(1/\epsilon^4)$ iterations are needed (Amir et al. [2]). The lower iteration complexity is achieved by leveraging uniform convergence rather than stability; however, instead of uniform convergence in a norm ball, which we show can guarantee only suboptimal learning using $\Theta(1/\epsilon^4)$ samples, we rely on uniform convergence in a distribution-dependent ball.
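The procedure analyzed in the abstract is plain unprojected, unregularized gradient descent on the empirical risk, stopped after roughly as many iterations as there are samples. The sketch below is a minimal illustration of that recipe, not the authors' exact procedure: the hinge loss stands in for a generic 1-Lipschitz scalar convex loss of a linear score, and the step size $\eta = 1/\sqrt{T}$ and stopping time $T = n$ are assumptions chosen to match the $\tilde{O}(1/\epsilon^2)$ sample-and-iteration regime described above; the function name is hypothetical.

```python
import numpy as np

def early_stopped_gd(X, y, T=None, eta=None):
    """Unprojected, unregularized full-batch GD on the empirical risk of a
    generalized linear objective, returning the averaged iterate.

    Illustrative assumptions: hinge loss phi(z) = max(0, 1 - z) as the
    1-Lipschitz scalar convex loss, T = n iterations, step size 1/sqrt(T).
    X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    """
    n, d = X.shape
    T = T if T is not None else n           # early stopping after ~n steps
    eta = eta if eta is not None else 1.0 / np.sqrt(T)
    w = np.zeros(d)
    w_sum = np.zeros(d)
    for _ in range(T):
        margins = y * (X @ w)               # scalar linear scores y_i <w, x_i>
        active = margins < 1.0              # points where the hinge loss is nonzero
        # subgradient of (1/n) * sum_i max(0, 1 - y_i <w, x_i>) at w
        g = -(y[active, None] * X[active]).sum(axis=0) / n
        w = w - eta * g                     # plain GD step: no projection onto a ball
        w_sum += w
    return w_sum / T                        # averaged iterate

# Toy usage on synthetic, linearly separable data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
w_star = rng.normal(size=5)
y = np.sign(X @ w_star)
w_hat = early_stopped_gd(X, y)
print("training hinge loss:", np.maximum(0.0, 1 - y * (X @ w_hat)).mean())
```

The point mirrored here is what the sketch omits: there is no projection step and no regularizer; only the iteration budget T plays the role of implicit regularization.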
Pages: 12
Related Papers
(50 in total)
  • [31] Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning
    Zhang, Xin
    Liu, Jia
    Zhu, Zhengyuan
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 3580 - 3585
  • [32] Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization
    Yan, Yan
    Xu, Yi
    Lin, Qihang
    Liu, Wei
    Yang, Tianbao
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [33] GENERALIZED STOCHASTIC GRADIENT LEARNING
    Evans, George W.
    Honkapohja, Seppo
    Williams, Noah
    INTERNATIONAL ECONOMIC REVIEW, 2010, 51 (01) : 237 - 262
  • [34] Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance
    Vural, Nuri Mert
    Yu, Lu
    Balasubramanian, Krishnakumar
    Volgushev, Stanislav
    Erdogdu, Murat A.
    CONFERENCE ON LEARNING THEORY, VOL 178, 2022, 178 : 65 - 102
  • [35] A Modified Stochastic Gradient Descent Optimization Algorithm With Random Learning Rate for Machine Learning and Deep Learning
    Shim, Duk-Sun
    Shim, Joseph
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, 21 (11) : 3825 - 3831
  • [37] Stochastic Gradient Descent for Linear Systems with Missing Data
    Ma, Anna
    Needell, Deanna
    NUMERICAL MATHEMATICS-THEORY METHODS AND APPLICATIONS, 2019, 12 (01) : 1 - 20
  • [38] Scaling up stochastic gradient descent for non-convex optimisation
    Mohamad, Saad
    Alamri, Hamad
    Bouchachia, Abdelhamid
    MACHINE LEARNING, 2022, 111 (11) : 4039 - 4079
  • [40] Comparison of the Stochastic Gradient Descent Based Optimization Techniques
    Yazan, Ersan
    Talu, M. Fatih
    2017 INTERNATIONAL ARTIFICIAL INTELLIGENCE AND DATA PROCESSING SYMPOSIUM (IDAP), 2017