Thinking Outside the Ball: Optimal Learning with Gradient Descent for Generalized Linear Stochastic Convex Optimization

Cited by: 0
Authors
Amir, Idan [1 ]
Livni, Roi [1 ]
Srebro, Nathan [2 ]
Affiliations
[1] Tel Aviv University, Tel Aviv, Israel
[2] Toyota Technological Institute at Chicago, Chicago, IL, USA
Keywords
DOI
Not available
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider linear prediction with a convex Lipschitz loss, or more generally, stochastic convex optimization problems of generalized linear form, i.e. where each instantaneous loss is a scalar convex function of a linear function. We show that in this setting, early-stopped Gradient Descent (GD), without any explicit regularization or projection, ensures excess error at most $\epsilon$ (compared to the best possible with unit Euclidean norm) with an optimal, up to logarithmic factors, sample complexity of $\tilde{O}(1/\epsilon^2)$ and only $\tilde{O}(1/\epsilon^2)$ iterations. This contrasts with general stochastic convex optimization, where $\tilde{O}(1/\epsilon^4)$ iterations are needed (Amir et al. [2]). The lower iteration complexity is ensured by leveraging uniform convergence rather than stability. But instead of uniform convergence in a norm ball, which we show can only guarantee suboptimal learning with $\Theta(1/\epsilon^4)$ samples, we rely on uniform convergence in a distribution-dependent ball.
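To make the abstract's setting concrete, here is a minimal sketch of unprojected, early-stopped GD on the empirical risk of a generalized linear objective. The function name `early_stopped_gd`, the absolute-loss example, and the particular choices of step size $\eta = 1/\sqrt{T}$ and stopping time $T \approx 1/\epsilon^2$ are illustrative assumptions in the spirit of the abstract, not the paper's exact algorithm or tuning.

```python
import numpy as np

def early_stopped_gd(X, y, eps, loss_grad):
    """Plain GD on the empirical risk of a generalized linear objective.

    X: (n, d) data matrix; y: (n,) targets.
    loss_grad(z, y): (sub)gradient of the scalar convex loss at z = <w, x>.
    Returns the averaged iterate; no projection onto a norm ball is applied.
    """
    n, d = X.shape
    T = int(np.ceil(1.0 / eps ** 2))   # ~1/eps^2 iterations (up to log factors)
    eta = 1.0 / np.sqrt(T)             # textbook step size for Lipschitz losses
    w = np.zeros(d)
    w_avg = np.zeros(d)
    for _ in range(T):
        z = X @ w                       # linear predictions <w, x_i>
        g = X.T @ loss_grad(z, y) / n   # gradient of the empirical risk
        w = w - eta * g                 # unprojected, unregularized step
        w_avg += w / T                  # running average of iterates
    return w_avg

# Usage: absolute loss |z - y|, convex and 1-Lipschitz in its first argument.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) / np.sqrt(5)   # roughly unit-norm features
w_star = rng.normal(size=5)
w_star /= np.linalg.norm(w_star)             # unit-Euclidean-norm comparator
y = X @ w_star + 0.1 * rng.normal(size=200)
w_hat = early_stopped_gd(X, y, eps=0.05,
                         loss_grad=lambda z, t: np.sign(z - t))
print("distance to comparator:", np.linalg.norm(w_hat - w_star))
```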
Pages: 12
Related Papers
50 records in total; items [21]–[30] shown below
  • [21] Generalized stochastic Frank-Wolfe algorithm with stochastic "substitute" gradient for structured convex optimization
    Lu, Haihao
    Freund, Robert M.
    MATHEMATICAL PROGRAMMING, 2021, 187 (1-2) : 317 - 349
  • [22] Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization
    Haghifam, Mahdi
    Rodriguez-Galvez, Borja
    Thobaben, Ragnar
    Skoglund, Mikael
    Roy, Daniel M.
    Dziugaite, Gintare Karolina
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 663 - 706
  • [24] Ant colony optimization and stochastic gradient descent
    Meuleau, N
    Dorigo, M
    ARTIFICIAL LIFE, 2002, 8 (02) : 103 - 121
  • [25] Stochastic gradient descent for wind farm optimization
    Quick, Julian
    Rethore, Pierre-Elouan
    Pedersen, Mads Molgaard
    Rodrigues, Rafael Valotta
    Friis-Moller, Mikkel
    WIND ENERGY SCIENCE, 2023, 8 (08) : 1235 - 1250
  • [26] Stochastic Chebyshev Gradient Descent for Spectral Optimization
    Han, Insu
    Avron, Haim
    Shin, Jinwoo
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] Stochastic gradient descent for optimization for nuclear systems
    Williams, Austin
    Walton, Noah
    Maryanski, Austin
    Bogetic, Sandra
    Hines, Wes
    Sobes, Vladimir
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [28] Efficient displacement convex optimization with particle gradient descent
    Daneshmand, Hadi
    Lee, Jason D.
    Jin, Chi
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
  • [29] Evolutionary Gradient Descent for Non-convex Optimization
    Xue, Ke
    Qian, Chao
    Xu, Ling
    Fei, Xudong
    PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 3221 - 3227
  • [30] Online convex optimization in the bandit setting: gradient descent without a gradient
    Flaxman, Abraham D.
    Kalai, Adam Tauman
    McMahan, H. Brendan
    PROCEEDINGS OF THE SIXTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2005, : 385 - 394