Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Cited by: 0

Authors:

Tian, Haoxing [1]

Paschalidis, Ioannis Ch. [1]

Olshevsky, Alex [1]

Affiliations:

[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we are taking the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected on a ball around the initial condition will converge to a neighborhood where the average of the squared gradients is $\tilde{O}(1/\sqrt{m}) + O(\epsilon)$, with $m$ being the width of the neural network and $\epsilon$ the approximation quality of the best critic neural network over the projected set.


Pages: 43

