Improved Powered Stochastic Optimization Algorithms for Large-Scale Machine Learning

Cited by: 0
Authors
Yang, Zhuang [1 ]
Affiliations
[1] Soochow Univ, Sch Comp Sci & Technol, Suzhou 215006, Peoples R China
Funding
China Postdoctoral Science Foundation
Keywords
Powerball function; stochastic optimization; variance reduction; adaptive learning rate; non-convex optimization
Keywords Plus: REGULARIZATION; DESCENT; STEP
DOI
Not available
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812 (Computer Science and Technology)
Abstract
Stochastic optimization, especially stochastic gradient descent (SGD), is now the workhorse for the vast majority of problems in machine learning. Various strategies, e.g., control variates, adaptive learning rates, and momentum techniques, have been developed to improve canonical SGD, which suffers from a slow convergence rate and poor generalization in practice. Most of these strategies improve SGD either by controlling the updating direction (e.g., the gradient descent or gradient ascent direction) or by manipulating the learning rate. Along these two lines, this work first develops and analyzes a novel class of improved powered stochastic gradient descent algorithms from the perspective of variance reduction, where the updating direction is determined by the Powerball function. Additionally, to bridge the gap between powered stochastic optimization (PSO) and the learning rate, which remains an open problem for PSO, we propose an adaptive mechanism for updating the learning rate that resorts to a Barzilai-Borwein (BB)-like scheme, not only for the proposed algorithm but also for classical PSO algorithms. The theoretical properties of the resulting algorithms for non-convex optimization problems are analyzed. Empirical tests on various benchmark data sets demonstrate the efficiency and robustness of the proposed algorithms.
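To make the abstract concrete, the following minimal Python sketch combines the three ingredients it names: a Powerball transform sigma_gamma(x) = sign(x)|x|^gamma applied elementwise to the update direction, an SVRG-style variance-reduced gradient, and a BB-like step size computed from successive snapshots. The least-squares objective, the synthetic data, and the particular BB variant (in the style of SVRG-BB) are illustrative assumptions; the paper's exact update rules may differ.

import numpy as np

def powerball(x, gamma=0.5):
    # Elementwise Powerball transform: sign(x) * |x|**gamma, with gamma in (0, 1).
    return np.sign(x) * np.abs(x) ** gamma

def grad_i(w, A, b, i):
    # Stochastic gradient of the i-th least-squares term 0.5 * (a_i^T w - b_i)^2.
    return (A[i] @ w - b[i]) * A[i]

def full_grad(w, A, b):
    # Full gradient of the average least-squares loss.
    return A.T @ (A @ w - b) / len(b)

def powered_svrg_bb(A, b, epochs=20, gamma=0.5, eta0=0.05):
    # Powered SVRG with a BB-like step size (illustrative, not the paper's exact scheme).
    n, d = A.shape
    w = np.zeros(d)
    w_snap, mu = w.copy(), full_grad(w, A, b)
    eta = eta0
    for s in range(epochs):
        if s > 0:
            # BB-like step size from successive snapshots and full gradients,
            # in the SVRG-BB style: eta = ||dw||^2 / (n * |dw^T dg|).
            dw, dg = w_snap - w_prev, mu - mu_prev
            eta = (dw @ dw) / (n * abs(dw @ dg) + 1e-12)
        w_prev, mu_prev = w_snap.copy(), mu.copy()
        for _ in range(n):  # one pass over the data per epoch
            i = np.random.randint(n)
            # Variance-reduced gradient (control variate), then Powerball transform.
            g = grad_i(w, A, b, i) - grad_i(w_snap, A, b, i) + mu
            w -= eta * powerball(g, gamma)
        w_snap, mu = w.copy(), full_grad(w, A, b)
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 10))
    w_true = rng.standard_normal(10)
    b = A @ w_true + 0.01 * rng.standard_normal(200)
    w_hat = powered_svrg_bb(A, b, epochs=15, gamma=0.6)
    print("recovery error:", np.linalg.norm(w_hat - w_true))

Setting gamma = 1 makes the Powerball transform the identity, recovering a plain SVRG-BB update, which is one way to sanity-check the sketch.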
Pages: 29