Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Citations: 0
Authors
Tian, Haoxing [1]
Paschalidis, Ioannis Ch. [1]
Olshevsky, Alex [1]
Affiliations
[1] Boston Univ, Dept Elect & Comp Engn, Boston, MA 02215 USA
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we take the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected on a ball around the initial condition will converge to a neighborhood where the average of the squared gradients is $\tilde{O}(1/\sqrt{m}) + O(\epsilon)$, with $m$ being the width of the neural network and $\epsilon$ the approximation quality of the best critic neural network over the projected set.
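The projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a Euclidean ball, a flat parameter vector stored as a list of floats, and hypothetical names (`project_onto_ball`, `projected_step`, `radius`, `step_size`) that do not come from the paper. The update direction is taken as given, since the abstract does not specify how it is computed.

```python
import math
from typing import List


def project_onto_ball(theta: List[float], theta0: List[float], radius: float) -> List[float]:
    """Euclidean projection of theta onto {x : ||x - theta0|| <= radius}."""
    diff = [t - t0 for t, t0 in zip(theta, theta0)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist <= radius:
        # Already inside the ball: the projection leaves theta unchanged.
        return list(theta)
    # Outside the ball: rescale the offset from theta0 to have length `radius`.
    scale = radius / dist
    return [t0 + scale * d for t0, d in zip(theta0, diff)]


def projected_step(
    theta: List[float],
    theta0: List[float],
    direction: List[float],
    step_size: float,
    radius: float,
) -> List[float]:
    """Move along `direction` (assumed given), then project back onto the ball."""
    moved = [t + step_size * g for t, g in zip(theta, direction)]
    return project_onto_ball(moved, theta0, radius)
```

In this sketch the same routine would be applied to both the actor and the critic parameters after each update, so the iterates never leave the ball centered at their initial values; the radius and step size are free choices here, not values taken from the paper.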
Pages: 43