STOCHASTIC GRADIENT DESCENT WITH FINITE SAMPLE SIZES

Cited: 0
Authors
Yuan, Kun [1 ]
Ying, Bicheng [1 ]
Vlaski, Stefan [1 ]
Sayed, Ali H. [1 ]
Affiliations
[1] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90024 USA
Keywords
Online learning; stochastic gradient descent; constant step-size; mini-batch technique; importance sampling; approximation
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline Classification Codes
0808; 0809
Abstract
The minimization of empirical risks over finite sample sizes is an important problem in large-scale machine learning. A variety of algorithms have been proposed in the literature to alleviate the computational burden per iteration, at the expense of convergence speed and accuracy. Many of these approaches can be interpreted as stochastic gradient descent algorithms in which data are sampled from particular empirical distributions. In this work, we leverage this interpretation and draw on recent results from the field of online adaptation to derive new tight performance expressions for empirical implementations of stochastic gradient descent, mini-batch gradient descent, and importance sampling. The expressions are exact to first order in the step-size parameter and are tighter than existing bounds. We further quantify the performance gain from employing mini-batch solutions, and we propose an optimal importance-sampling algorithm to improve performance.
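To make the three sampling schemes the abstract refers to concrete, the sketch below implements constant step-size stochastic gradient descent over a finite sample, with optional mini-batching and importance sampling via reweighted gradients. This is a minimal illustration under stated assumptions, not the paper's algorithm: the function name empirical_sgd, the least-squares loss, and the norm-based sampling heuristic in the usage example are all illustrative choices.

```python
import numpy as np

def empirical_sgd(X, y, mu=0.01, batch=1, probs=None, iters=1000, seed=0):
    """Constant step-size SGD over a finite sample {(x_n, y_n)}.

    probs=None -> uniform sampling (plain SGD; batch>1 gives mini-batch SGD)
    probs=p    -> importance sampling; each gradient is scaled by
                  1/(N*p_n) so the update remains an unbiased estimate
                  of the empirical-risk gradient.
    """
    N, d = X.shape
    w = np.zeros(d)
    rng = np.random.default_rng(seed)
    p = np.full(N, 1.0 / N) if probs is None else probs / probs.sum()
    for _ in range(iters):
        idx = rng.choice(N, size=batch, p=p)
        # per-sample gradient of the squared loss 0.5*(x_n^T w - y_n)^2
        g = (X[idx] @ w - y[idx])[:, None] * X[idx]
        # reweight by 1/(N*p_n); under uniform sampling this factor is 1
        g = g / (N * p[idx])[:, None]
        w -= mu * g.mean(axis=0)
    return w

# Illustrative usage on synthetic least-squares data.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
w_true = rng.standard_normal(10)
y = X @ w_true + 0.1 * rng.standard_normal(500)

w_sgd   = empirical_sgd(X, y)            # plain SGD, uniform sampling
w_batch = empirical_sgd(X, y, batch=32)  # mini-batch SGD
# sample proportionally to per-row norm (a common, assumed heuristic)
w_is    = empirical_sgd(X, y, probs=np.linalg.norm(X, axis=1))
```

Because the reweighted estimate is unbiased for any valid sampling distribution, the choice of probs affects only the gradient-noise variance; the abstract's importance-sampling result concerns choosing that distribution to optimize steady-state performance.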
Pages: 6