Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

Cited by: 0
Authors
Vakili, Sattar [1 ]
Salgia, Sudeep [2 ]
Zhao, Qing [2 ]
Affiliations
[1] PROWLER.io, Cambridge, England
[2] Cornell University, School of Electrical and Computer Engineering, Ithaca, NY 14853, USA
Funding
National Science Foundation (USA)
DOI
10.1109/allerton.2019.8919740
Chinese Library Classification
TP [Automation and Computer Technology]
Discipline Classification Code
0812
Abstract
Online minimization of an unknown convex function over the interval [0, 1] is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing the distribution of the random gradients, a learning algorithm sequentially chooses query points with the objective of minimizing regret, defined as the expected cumulative loss of the function values at the query points in excess of the minimum value of the function. An approach is developed based on a biased random walk on an infinite-depth binary tree constructed through successive partitioning of the domain of the function. Each move of the random walk is guided by a sequential test based on confidence bounds on the empirical mean constructed using the law of the iterated logarithm. With no tuning parameters, this learning algorithm is robust to heavy-tailed noise with infinite variance and adaptive to unknown function characteristics (specifically, convex, strongly convex, and nonsmooth). It achieves the corresponding optimal regret order (up to a √(log T) or a log log T factor) in each class of functions, and offers regret orders that match or improve on those of the classical stochastic gradient descent approach, which requires knowledge of the function characteristics to tune the sequence of step-sizes.
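To make the abstract's mechanism concrete, below is a minimal Python sketch of a walk down the bisection tree of [0, 1]. It is not the paper's exact algorithm: the names (grad_oracle, lil_radius, tree_sgd), the confidence-radius constants, and the choice to test the empirical mean of gradient *signs* (which keeps every sample bounded in [-1, 1], one way to tolerate infinite-variance noise) are all illustrative assumptions. In particular, the paper's biased random walk can also backtrack to a parent node when a test indicates a wrong earlier move; this sketch only descends.

```python
import math
import random


def lil_radius(n, delta=0.05):
    """Anytime confidence radius in the spirit of the law of the iterated
    logarithm, for the empirical mean of i.i.d. samples bounded in [-1, 1].
    Constants are illustrative, not the paper's exact sequential test."""
    n = max(n, 3)
    return math.sqrt(2.0 * math.log(4.0 * math.log2(2.0 * n) / delta) / n)


def tree_sgd(grad_oracle, horizon, delta=0.05):
    """Sketch: descend the binary tree of successive bisections of [0, 1].

    At the current node (an interval) we repeatedly sample the gradient at
    the midpoint and run a sequential test on the mean of the gradient
    signs; once the LIL-style confidence interval around that empirical
    mean excludes zero, we move to the child interval containing the
    minimizer's likely side.
    """
    lo, hi = 0.0, 1.0
    queries = 0
    while queries < horizon:
        mid = 0.5 * (lo + hi)
        sign_sum, n = 0.0, 0
        # Sequential test at the midpoint: stop once the empirical mean
        # sign is significantly different from zero.
        while queries < horizon:
            g = grad_oracle(mid)              # one noisy first-order observation
            sign_sum += math.copysign(1.0, g)
            n += 1
            queries += 1
            if abs(sign_sum / n) > lil_radius(n, delta):
                break
        if sign_sum > 0:
            hi = mid    # gradient likely positive: minimizer is to the left
        else:
            lo = mid    # gradient likely negative: minimizer is to the right
    return 0.5 * (lo + hi)


# Usage: f(x) = (x - 0.3)^2 with standard-Cauchy (infinite-variance) noise
# added to the true gradient 2 * (x - 0.3).
def noisy_grad(x):
    return 2.0 * (x - 0.3) + math.tan(math.pi * (random.random() - 0.5))


print(tree_sgd(noisy_grad, horizon=20000))    # a value near 0.3
```

Note the design choice this sketch shares with the abstract's description: there is no step-size sequence to tune, so the same procedure runs unchanged whether the unknown function is convex, strongly convex, or nonsmooth.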
Pages: 432-438
Page count: 7