A momentum accelerated stochastic method and its application on policy search problems

Authors
Boou Jiang [1]
Ya-xiang Yuan [2]
Affiliations
[1] LSEC
[2] ICMSEC
[3] AMSS
[4] Chinese Academy of Sciences
[5] University of Chinese Academy of Sciences
Keywords
Stochastic algorithm; Non-convex optimization; Reinforcement learning; 65K05; 90C15; 90C25; 90C30
DOI
10.1007/s00521-024-10883-y
Abstract
With the dramatic increase in model complexity and problem scale in machine learning, first-order stochastic methods and their accelerated variants for non-convex problems have attracted wide research interest. However, most convergence analyses of accelerated methods focus on general convex or strongly convex objective functions. In this paper, we consider an accelerated scheme derived from dynamical systems and ordinary differential equations, which has a simpler and more direct form than the traditional scheme. We construct auxiliary sequences of iteration points as analysis tools, which can be interpreted as an extension of Nesterov's estimate sequence to the non-convex case. We analyze the convergence properties when the momentum parameters are fixed and when they vary over iterations. For non-smooth and general convex objective functions, we give a relaxed step-size requirement that ensures convergence. For the non-convex policy search problem in classical reinforcement learning, we propose an accelerated stochastic policy gradient method with a restart technique and conduct numerical experiments to verify its effectiveness.
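
The abstract does not give the update rule, so the following is only a generic illustration of the two ingredients it names, momentum and restart, and should not be read as the paper's algorithm. It is a minimal sketch of heavy-ball stochastic gradient descent that resets the momentum whenever the objective value increases. The function name `momentum_sgd_with_restart`, the double-well test objective, the restart criterion, and all parameter values are assumptions chosen for this example.

```python
import random


def momentum_sgd_with_restart(grad, loss, x0, step=0.02, beta=0.9,
                              iters=500, noise=0.01, seed=0):
    """Heavy-ball SGD with a function-value restart (illustrative only).

    grad, loss: callables on a list of floats (hypothetical interface).
    Returns the final iterate and the number of restarts performed.
    """
    rng = random.Random(seed)
    x = list(x0)
    v = [0.0] * len(x)          # momentum (velocity) term
    prev = loss(x)
    restarts = 0
    for _ in range(iters):
        # Stochastic gradient: exact gradient plus Gaussian noise.
        g = [gi + rng.gauss(0.0, noise) for gi in grad(x)]
        # Heavy-ball update: v <- beta * v - step * g, then x <- x + v.
        v = [beta * vi - step * gi for vi, gi in zip(v, g)]
        x = [xi + vi for xi, vi in zip(x, v)]
        cur = loss(x)
        # Restart: discard the accumulated momentum if the loss went up.
        if cur > prev:
            v = [0.0] * len(x)
            restarts += 1
        prev = cur
    return x, restarts


# Assumed non-convex test problem: f(x, y) = (x^2 - 1)^2 + y^2,
# with minimizers at (1, 0) and (-1, 0).
def loss(p):
    return (p[0] ** 2 - 1.0) ** 2 + p[1] ** 2


def grad(p):
    return [4.0 * p[0] * (p[0] ** 2 - 1.0), 2.0 * p[1]]


x_final, n_restarts = momentum_sgd_with_restart(grad, loss, [2.0, 1.5])
print(x_final, n_restarts)
```

Resetting the velocity on a loss increase is one common restart heuristic; the paper's restart rule and its momentum parameter schedule may differ.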
Pages: 5957-5973
Number of pages: 16