Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning

Cited by: 0
Authors
Yang, Zhuang [1 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Powerball function; stochastic optimization; variance reduction; adaptive learning rate; non-convex optimization
Keywords Plus: REGULARIZATION; DESCENT; STEP
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812 (Computer Science and Technology)
Abstract
Stochastic optimization, especially stochastic gradient descent (SGD), is now the workhorse for the vast majority of problems in machine learning. Various strategies, e.g., control variates, adaptive learning rates, and momentum techniques, have been developed to improve canonical SGD, which suffers from a slow convergence rate and poor generalization in practice. Most of these strategies improve SGD either by controlling the updating direction (e.g., the gradient descent or gradient ascent direction) or by manipulating the learning rate. Along these two lines, this work first develops and analyzes a novel class of improved powered stochastic gradient descent algorithms from the perspective of variance reduction, where the updating direction is determined by the Powerball function. Additionally, to bridge the gap between powered stochastic optimization (PSO) and the learning rate, which remains an open problem for PSO, we propose an adaptive mechanism for updating the learning rate that resorts to a Barzilai-Borwein (BB)-like scheme, not only for the proposed algorithm but also for classical PSO algorithms. The theoretical properties of the resulting algorithms for non-convex optimization problems are analyzed. Empirical tests on various benchmark data sets demonstrate the efficiency and robustness of the proposed algorithms.
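To make the abstract concrete, the following minimal Python sketch combines the three ingredients it names: a Powerball transform sigma_gamma(x) = sign(x)|x|^gamma applied elementwise to the update direction, an SVRG-style variance-reduced gradient, and a BB-like step size computed from successive snapshots. The least-squares objective, the synthetic data, and the particular BB variant (in the style of SVRG-BB) are illustrative assumptions; the paper's exact update rules may differ.

import numpy as np

def powerball(x, gamma=0.5):
    # Elementwise Powerball transform: sign(x) * |x|**gamma, with gamma in (0, 1).
    return np.sign(x) * np.abs(x) ** gamma

def grad_i(w, A, b, i):
    # Stochastic gradient of the i-th least-squares term 0.5 * (a_i^T w - b_i)^2.
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w, A, b):
    # Full gradient of the average least-squares loss.
    return A.T @ (A @ w - b) / len(b)

def powered_svrg_bb(A, b, epochs=20, gamma=0.5, eta0=0.05):
    # Powered SVRG with a BB-like step size (illustrative, not the paper's exact scheme).
    n, d = A.shape
    w = np.zeros(d)
    w_snap, mu = w.copy(), full_grad(w, A, b)
    eta = eta0
    for s in range(epochs):
        if s > 0:
            # BB-like step size from successive snapshots and full gradients,
            # in the SVRG-BB style: eta = ||dw||^2 / (n * |dw^T dg|).
            dw, dg = w_snap - w_prev, mu - mu_prev
            eta = (dw @ dw) / (n * abs(dw @ dg) + 1e-12)
        w_prev, mu_prev = w_snap.copy(), mu.copy()
        for _ in range(n):  # one pass over the data per epoch
            i = np.random.randint(n)
            # Variance-reduced gradient (control variate), then Powerball transform.
            g = grad_i(w, A, b, i) - grad_i(w_snap, A, b, i) + mu
            w -= eta * powerball(g, gamma)
        w_snap, mu = w.copy(), full_grad(w, A, b)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 10))
    w_true = rng.standard_normal(10)
    b = A @ w_true + 0.01 * rng.standard_normal(200)
    w_hat = powered_svrg_bb(A, b, epochs=15, gamma=0.6)
    print("recovery error:", np.linalg.norm(w_hat - w_true))

Setting gamma = 1 makes the Powerball transform the identity, recovering a plain SVRG-BB update, which is one way to sanity-check the sketch.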
Pages: 29