Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Authors
Tian, Haoxing [1]
Paschalidis, Ioannis Ch. [1]
Olshevsky, Alex [1]
Affiliations
[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we take the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected onto a ball around the initial condition converge to a neighborhood where the average of the squared gradients is $\tilde{O}(1/\sqrt{m}) + O(\epsilon)$, where $m$ is the width of the neural network and $\epsilon$ is the approximation quality of the best critic neural network over the projected set.
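The projection step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which norm or radius the paper uses, so the Euclidean norm, the `radius` parameter, and the helper names `project_onto_ball` and `projected_step` are all assumptions made for illustration. This is not the authors' algorithm.

```python
import numpy as np


def project_onto_ball(theta, theta0, radius):
    """Euclidean projection of theta onto {x : ||x - theta0||_2 <= radius}.

    Assumption: an L2 ball; the paper's exact projection set may differ.
    """
    offset = theta - theta0
    dist = np.linalg.norm(offset)
    if dist <= radius:
        return theta  # already inside the ball, nothing to do
    # Rescale the offset so the result lies on the ball's surface.
    return theta0 + offset * (radius / dist)


def projected_step(theta, theta0, direction, step_size, radius):
    """One update along `direction`, followed by projection.

    Hypothetical helper: `direction` stands in for whatever actor or
    critic update direction is used; it is not computed here.
    """
    return project_onto_ball(theta + step_size * direction, theta0, radius)
```

Applying `projected_step` repeatedly keeps the parameters within distance `radius` of the initial point `theta0`, which is the kind of constraint the abstract refers to as updates "projected onto a ball around the initial condition".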
Pages: 43