Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

Times Cited: 0
Authors
Vakili, Sattar [1 ]
Salgia, Sudeep [2 ]
Zhao, Qing [2 ]
Affiliations
[1] PROWLER.io, Cambridge, England
[2] Cornell Univ, Sch Elect & Comp Engn, Ithaca, NY 14853 USA
Funding
US National Science Foundation;
Keywords
DOI
10.1109/allerton.2019.8919740
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Online minimization of an unknown convex function over the interval [0, 1] is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing the distribution of the random gradients, a learning algorithm sequentially chooses query points with the objective of minimizing regret, defined as the expected cumulative loss of the function values at the query points in excess of the minimum value of the function. The approach developed here devises a biased random walk on an infinite-depth binary tree constructed through successive partitioning of the domain of the function. Each move of the random walk is guided by a sequential test based on confidence bounds on the empirical mean, constructed using the law of the iterated logarithm. With no tuning parameters, this learning algorithm is robust to heavy-tailed noise with infinite variance and adaptive to unknown function characteristics (specifically, convex, strongly convex, and nonsmooth). It achieves the corresponding optimal regret order in each class of functions (up to a √(log T) or a log log T factor) and offers regret orders that match or improve on those of the classical stochastic gradient descent approach, which requires knowledge of the function characteristics to tune its sequence of step sizes.
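To make the tree-walk idea concrete, the following is a minimal Python sketch of the mechanism the abstract describes, not the authors' exact algorithm. The gradient oracle (noisy_gradient), the constants in the LIL-style confidence radius (lil_radius), and the per-test sample cap in grad_sign are illustrative assumptions: the sketch uses Gaussian noise for simplicity, whereas the paper's sequential test is designed to tolerate heavy-tailed noise.

import math
import random

def noisy_gradient(x, minimizer=0.3):
    # Hypothetical first-order oracle for f(x) = (x - minimizer)^2 on [0, 1]:
    # returns f'(x) corrupted by unit-variance Gaussian noise.
    return 2.0 * (x - minimizer) + random.gauss(0.0, 1.0)

def lil_radius(n, delta=0.05):
    # Anytime-valid confidence radius of law-of-the-iterated-logarithm type;
    # the constants are illustrative, not the paper's.
    t = max(n, 3)
    return math.sqrt((2.0 * math.log(math.log(t)) + math.log(1.0 / delta)) / t)

def grad_sign(x, max_samples=10000):
    # Sequential test: sample gradients at x until the empirical mean clears its
    # confidence radius, then report its sign; 0 means inconclusive (x is so
    # close to the minimizer that the true gradient is too small to resolve).
    total = 0.0
    for n in range(1, max_samples + 1):
        total += noisy_gradient(x)
        if abs(total / n) > lil_radius(n):
            return 1 if total > 0 else -1
    return 0

def tree_walk(depth_budget=20, max_moves=200):
    # Walk on the binary tree of dyadic subintervals of [0, 1]. Descend into the
    # child indicated by the gradient sign at the midpoint; back up to the parent
    # when the endpoint signs show the minimizer has escaped the current interval.
    stack = [(0.0, 1.0)]  # root node = the whole domain
    moves = 0
    while len(stack) < depth_budget and moves < max_moves:
        moves += 1
        lo, hi = stack[-1]
        # Convexity: f'(lo) > 0 puts the minimizer left of lo; f'(hi) < 0, right of hi.
        if len(stack) > 1 and (grad_sign(lo) > 0 or grad_sign(hi) < 0):
            stack.pop()  # an earlier move was wrong: step back up the tree
            continue
        mid = (lo + hi) / 2.0
        s = grad_sign(mid)
        if s > 0:
            stack.append((lo, mid))  # descend into the left child
        elif s < 0:
            stack.append((mid, hi))  # descend into the right child
        else:
            break  # inconclusive test: the midpoint already sits near the minimizer
    lo, hi = stack[-1]
    return (lo + hi) / 2.0

if __name__ == "__main__":
    random.seed(0)
    print("estimated minimizer:", round(tree_walk(), 4))  # true minimizer is 0.3

The back-up step is what makes the walk robust rather than purely greedy: a mistaken descent is not fatal, because later endpoint tests detect that the minimizer lies outside the current node and return the walk to the parent, mirroring the biased-random-walk behavior described in the abstract.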
Pages: 432-438
Number of Pages: 7