STOCHASTIC GRADIENT DESCENT WITH FINITE SAMPLE SIZES

Cited: 0
Authors
Yuan, Kun [1 ]
Ying, Bicheng [1 ]
Vlaski, Stefan [1 ]
Sayed, Ali H. [1 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90024 USA
Keywords
Online learning; stochastic gradient descent; constant step-size; mini-batch technique; importance sampling; approximation
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
The minimization of empirical risks over finite sample sizes is an important problem in large-scale machine learning. A variety of algorithms have been proposed in the literature to alleviate the computational burden per iteration, at the expense of convergence speed and accuracy. Many of these approaches can be interpreted as stochastic gradient descent algorithms in which data are sampled from particular empirical distributions. In this work, we leverage this interpretation and draw on recent results from the field of online adaptation to derive new tight performance expressions for empirical implementations of stochastic gradient descent, mini-batch gradient descent, and importance sampling. The expressions are exact to first order in the step-size parameter and are tighter than existing bounds. We further quantify the performance gain from employing mini-batch solutions, and we propose an optimal importance-sampling algorithm to improve performance.
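To make the three sampling schemes the abstract refers to concrete, the sketch below implements constant step-size stochastic gradient descent over a finite sample, with optional mini-batching and importance sampling via reweighted gradients. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function name empirical_sgd, the least-squares loss, and the norm-based sampling heuristic in the usage example are all illustrative choices.

```python
import numpy as np

def empirical_sgd(X, y, mu=0.01, batch=1, probs=None, iters=1000, seed=0):
    """Constant step-size SGD over a finite sample {(x_n, y_n)}.

    probs=None -> uniform sampling (plain SGD; batch>1 gives mini-batch SGD)
    probs=p    -> importance sampling; each gradient is scaled by
                  1/(N*p_n) so the update remains an unbiased estimate
                  of the empirical-risk gradient.
    """
    N, d = X.shape
    w = np.zeros(d)
    rng = np.random.default_rng(seed)
    p = np.full(N, 1.0 / N) if probs is None else probs / probs.sum()
    for _ in range(iters):
        idx = rng.choice(N, size=batch, p=p)
        # per-sample gradient of the squared loss 0.5*(x_n^T w - y_n)^2
        g = (X[idx] @ w - y[idx])[:, None] * X[idx]
        # reweight by 1/(N*p_n); under uniform sampling this factor is 1
        g = g / (N * p[idx])[:, None]
        w -= mu * g.mean(axis=0)
    return w

# Illustrative usage on synthetic least-squares data.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
w_true = rng.standard_normal(10)
y = X @ w_true + 0.1 * rng.standard_normal(500)

w_sgd   = empirical_sgd(X, y)            # plain SGD, uniform sampling
w_batch = empirical_sgd(X, y, batch=32)  # mini-batch SGD
# sample proportionally to per-row norm (a common, assumed heuristic)
w_is    = empirical_sgd(X, y, probs=np.linalg.norm(X, axis=1))
```

Because the reweighted estimate is unbiased for any valid sampling distribution, the choice of probs affects only the gradient-noise variance; the abstract's importance-sampling result concerns choosing that distribution to optimize steady-state performance.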
Pages: 6