Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Cited by: 0

Authors:

Tian, Haoxing [1]

Paschalidis, Ioannis Ch. [1]

Olshevsky, Alex [1]

Affiliations:

[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we take the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected on a ball around the initial condition will converge to a neighborhood where the average of the squared gradients is $\tilde{O}(1/\sqrt{m}) + O(\epsilon)$, with $m$ being the width of the neural network and $\epsilon$ the approximation quality of the best critic neural network over the projected set.
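
The phrase "updates projected on a ball around the initial condition" can be illustrated with a minimal sketch. The abstract does not specify the norm, the radius, or the update rule, so the Euclidean projection, the plain gradient-style step, and the names `project_onto_ball`, `projected_step`, `radius`, and `step_size` below are all assumptions made for illustration, not the authors' algorithm.

```python
import math

def project_onto_ball(theta, theta_init, radius):
    """Euclidean projection of theta onto {x : ||x - theta_init|| <= radius}.

    Assumption: an L2 ball; the abstract does not state which norm is used.
    """
    offset = [t - t0 for t, t0 in zip(theta, theta_init)]
    norm = math.sqrt(sum(d * d for d in offset))
    if norm <= radius:
        # Already inside the ball: the projection leaves the point unchanged.
        return list(theta)
    # Outside the ball: rescale the offset so it lands on the boundary.
    scale = radius / norm
    return [t0 + scale * d for t0, d in zip(theta_init, offset)]

def projected_step(theta, grad, theta_init, step_size, radius):
    """One hypothetical gradient-style step followed by projection."""
    proposal = [t - step_size * g for t, g in zip(theta, grad)]
    return project_onto_ball(proposal, theta_init, radius)

if __name__ == "__main__":
    theta_init = [0.0, 0.0]
    # A large step leaves the ball and is pulled back to its boundary.
    print(projected_step(theta_init, [3.0, 4.0], theta_init, 1.0, 1.0))
```

Keeping the parameters within a fixed distance of their initial values is what lets the bound be stated in terms of the best critic network "over the projected set".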


Pages: 43


[11] Automatic Generation Control Based on Multiple Neural Networks With Actor-Critic Strategy
Xi, Lei
Wu, Junnan
Xu, Yanchun
Sun, Hongbin
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (06) : 2483 - 2493
[12] Bi-level Multi-Agent Actor-Critic Methods with Transformers
Wan, Tianjiao
Mi, Haibo
Gao, Zijian
Zhai, Yuanzhao
Ding, Bo
Feng, Dawei
2023 IEEE INTERNATIONAL CONFERENCE ON JOINT CLOUD COMPUTING, JCC, 2023, : 9 - 16
[13] Multi-Agent Actor-Critic for Cooperative Resource Allocation in Vehicular Networks
Hammami, Nessrine
Nguyen, Kim Khoa
PROCEEDINGS OF THE 2022 14TH IFIP WIRELESS AND MOBILE NETWORKING CONFERENCE (WMNC 2022), 2022, : 93 - 100
[14] Efficient Model Learning Methods for Actor-Critic Control
Grondman, Ivo
Vaandrager, Maarten
Busoniu, Lucian
Babuska, Robert
Schuitema, Erik
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2012, 42 (03) : 591 - 602
[15] A unified NDP method for MDPs by actor-critic networks
Tang Hao
Chen Dong
Zhou Lei
PROCEEDINGS OF THE 24TH CHINESE CONTROL CONFERENCE, VOLS 1 AND 2, 2005, : 1012 - 1016
[16] Addressing Function Approximation Error in Actor-Critic Methods
Fujimoto, Scott
van Hoof, Herke
Meger, David
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
[17] THE MINIMUM VALUE STATE PROBLEM IN ACTOR-CRITIC NETWORKS
Velasquez, Alvaro
Alkhouri, Ismail R.
Bissey, Brett
Barak, Lior
Atia, George K.
2022 IEEE 32ND INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2022,
[18] Actor-Critic Learning Algorithms for Mean-Field Control with Moment Neural Networks
Pham, Huyen
Warin, Xavier
METHODOLOGY AND COMPUTING IN APPLIED PROBABILITY, 2025, 27 (01)
[19] Actor-Critic Methods for IRS Design in Correlated Channel Environments: A Closer Look Into the Neural Tangent Kernel of the Critic
Evmorfos, Spilios
Petropulu, Athina P.
Poor, H. Vincent
IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2023, 71 : 4029 - 4044
[20] Multi-agent Attention Actor-Critic Algorithm for Load Balancing in Cellular Networks
Kang, Jikun
Wu, Di
Wang, Ju
Hossain, Ekram
Liu, Xue
Dedek, Gregory
ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 5160 - 5165
