On the Global Optimum Convergence of Momentum-based Policy Gradient

Cited by: 0
Authors
Ding, Yuhao [1 ]
Zhang, Junzi [2 ]
Lavaei, Javad [1 ]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Amazon Advertising, San Francisco, CA USA
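Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code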
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Policy gradient (PG) methods are popular and efficient for large-scale reinforcement learning due to their relative stability and incremental nature. In recent years, the empirical success of PG methods has led to the development of a theoretical foundation for these methods. In this work, we generalize this line of research by establishing the first set of global convergence results of stochastic PG methods with momentum terms, which have been demonstrated to be efficient recipes for improving PG methods. We study both the soft-max and the Fisher-non-degenerate policy parametrizations, and show that adding a momentum term improves the global optimality sample complexities of vanilla PG methods by $\tilde{O}(\epsilon^{-1.5})$ and $\tilde{O}(\epsilon^{-1})$, respectively, where $\epsilon > 0$ is the target tolerance. Our results for the generic Fisher-non-degenerate policy parametrizations also provide the first single-loop and finite-batch PG algorithm achieving an $\tilde{O}(\epsilon^{-3})$ global optimality sample complexity. Finally, as a byproduct, our analyses provide general tools for deriving the global convergence rates of stochastic PG methods, which can be readily applied and extended to other PG estimators under the two parametrizations.
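This record does not reproduce the paper's algorithm, so the following is only a minimal illustrative sketch of what "a stochastic PG method with a momentum term" can look like under a soft-max parametrization. It assumes a toy multi-armed bandit, a plain REINFORCE gradient estimate, and an exponential-moving-average momentum; the function names, the step size `lr`, the momentum weight `beta`, and the reward noise level are all hypothetical choices, not values taken from the paper.

```python
import numpy as np


def softmax(theta):
    """Numerically stable soft-max over a 1-D parameter vector."""
    z = theta - np.max(theta)
    e = np.exp(z)
    return e / e.sum()


def momentum_pg_bandit(mean_rewards, steps=5000, lr=0.1, beta=0.9, seed=0):
    """Soft-max policy gradient with an EMA momentum term on a toy bandit.

    Illustrative assumption only: this is a generic momentum scheme,
    not the estimator analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    mean_rewards = np.asarray(mean_rewards, dtype=float)
    theta = np.zeros(len(mean_rewards))  # soft-max policy parameters
    d = np.zeros_like(theta)             # momentum (averaged gradient)
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(len(pi), p=pi)    # sample an action from the policy
        r = mean_rewards[a] + 0.1 * rng.standard_normal()  # noisy reward
        # REINFORCE estimate: r * grad_theta log pi(a) = r * (e_a - pi)
        g = -r * pi
        g[a] += r
        # Momentum: blend the previous direction with the fresh estimate
        d = beta * d + (1.0 - beta) * g
        theta += lr * d                  # gradient ascent on expected reward
    return softmax(theta)


if __name__ == "__main__":
    print(momentum_pg_bandit([0.1, 0.5, 1.0]))
```

Setting `beta=0.0` recovers the vanilla stochastic PG update, which makes it easy to compare the two variants on the same toy problem.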
Pages: 25
Related papers
50 in total
  • [11] Ordering-based Conditions for Global Convergence of Policy Gradient Methods
    Mei, Jincheng
    Dai, Bo
    Agarwal, Alekh
    Ghavamzadeh, Mohammad
    Szepesvari, Csaba
    Schuurmans, Dale
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [12] Federated Gradient Averaging for Multi-Site Training with Momentum-Based Optimizers
    Remedios, Samuel W.
    Butman, John A.
    Landman, Bennett A.
    Pham, Dzung L.
    DOMAIN ADAPTATION AND REPRESENTATION TRANSFER, AND DISTRIBUTED AND COLLABORATIVE LEARNING, DART 2020, DCL 2020, 2020, 12444 : 170 - 180
  • [13] Tradeoffs Between Convergence Rate and Noise Amplification for Momentum-Based Accelerated Optimization Algorithms
    Mohammadi, Hesameddin
    Razaviyayn, Meisam
    Jovanovic, Mihailo R.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2025, 70 (02) : 889 - 904
  • [14] Accelerated Componentwise Gradient Boosting Using Efficient Data Representation and Momentum-Based Optimization
    Schalk, Daniel
    Bischl, Bernd
    Ruegamer, David
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2023, 32 (02) : 631 - 641
  • [15] Nanoswimmer-oriented Direct Targeting Strategy Inspired by Momentum-based Gradient Optimization
    Ali, Muhammad
    Cree, Michael J.
    Sharifi, Neda
    Chen, Yifan
    2019 41ST ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC), 2019, : 741 - 744
  • [16] DREAMPlace 4.0: Timing-driven Global Placement with Momentum-based Net Weighting
    Liao, Peiyu
    Liu, Siting
    Chen, Zhitang
    Lv, Wenlong
    Lin, Yibo
    Yu, Bei
    PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 939 - 944
  • [17] Global Convergence of Policy Gradient Methods for the Linear Quadratic Regulator
    Fazel, Maryam
    Ge, Rong
    Kakade, Sham M.
    Mesbahi, Mehran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [18] Noise amplification of momentum-based optimization algorithms
    Mohammadi, Hesameddin
    Razaviyayn, Meisam
    Jovanovic, Mihailo R.
    2023 AMERICAN CONTROL CONFERENCE, ACC, 2023, : 849 - 854
  • [19] A momentum-based deformation system for granular material
    Zeng, Ya-Lun
    Tan, Charlie Irawan
    Tai, Wen-Kai
    Yang, Mau-Tsuen
    Chiang, Cheng-Chin
    Chang, Chin-Chen
    COMPUTER ANIMATION AND VIRTUAL WORLDS, 2007, 18 (4-5) : 289 - 300
  • [20] Superconductivity induced by fluctuations of momentum-based multipoles
    Sumita, Shuntaro
    Yanase, Youichi
    PHYSICAL REVIEW RESEARCH, 2020, 2 (03):