A momentum accelerated stochastic method and its application on policy search problems

Cited by: 0
Authors
Boou Jiang [1]
Ya-xiang Yuan [2]
Affiliations
[1] LSEC
[2] ICMSEC
[3] AMSS
[4] Chinese Academy of Sciences
[5] University of Chinese Academy of Sciences
Keywords
Stochastic algorithm; Non-convex optimization; Reinforcement learning; 65K05; 90C15; 90C25; 90C30
DOI
10.1007/s00521-024-10883-y
Abstract
With the dramatic increase in model complexity and problem scale in machine learning, first-order stochastic methods and their accelerated variants for non-convex problems have attracted wide research interest. However, most convergence analyses of accelerated methods focus on general convex or strongly convex objective functions. In this paper, we consider an accelerated scheme derived from dynamical systems and ordinary differential equations, which has a simpler and more direct form than the traditional scheme. We construct auxiliary sequences of iteration points as analysis tools, which can be interpreted as an extension of Nesterov's estimate sequence to the non-convex case. We analyze the convergence properties both when the momentum parameters are fixed and when they vary over iterations. For non-smooth and general convex objective functions, we give a relaxed step-size requirement that ensures convergence. For the non-convex policy search problem in classical reinforcement learning, we propose an accelerated stochastic policy gradient method with a restart technique and conduct numerical experiments to verify its effectiveness.
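The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of a momentum-accelerated stochastic gradient iteration with restart. It assumes a textbook heavy-ball update and a common gradient-based restart heuristic (reset the momentum when it points uphill with respect to the sampled gradient); it is not the scheme analyzed in the paper. The function name `momentum_sgd_with_restart`, the toy objective, and all parameter values are hypothetical choices made for illustration.

```python
import random


def momentum_sgd_with_restart(grad, x0, step=0.05, beta=0.9,
                              iters=500, noise=0.01, seed=0):
    """Heavy-ball stochastic gradient with a gradient-based restart.

    Illustrative assumption: a generic textbook scheme, not the paper's method.
    """
    rng = random.Random(seed)
    x = list(x0)
    v = [0.0] * len(x)  # momentum (velocity) buffer
    restarts = 0
    for _ in range(iters):
        # Stochastic gradient: exact gradient plus zero-mean Gaussian noise.
        g = [gi + rng.gauss(0.0, noise) for gi in grad(x)]
        # Restart test: a positive inner product means the current velocity
        # points uphill with respect to the sampled gradient, so reset it.
        if sum(gi * vi for gi, vi in zip(g, v)) > 0.0:
            v = [0.0] * len(x)
            restarts += 1
        # Heavy-ball update with fixed momentum parameter beta.
        v = [beta * vi - step * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
    return x, restarts


def toy_grad(p):
    """Gradient of the toy non-convex objective f(x, y) = (x^2 - 1)^2 + y^2."""
    x, y = p
    return [4.0 * x * (x * x - 1.0), 2.0 * y]


x_final, n_restarts = momentum_sgd_with_restart(toy_grad, [2.0, 1.5])
print(x_final, n_restarts)
```

The toy objective has minimizers at (1, 0) and (-1, 0). In a policy search setting, `grad` would presumably be replaced by a sampled policy gradient estimate, which this sketch does not attempt to model.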
Pages: 5957-5973
Page count: 16