Forward Actor-Critic for Nonlinear Function Approximation in Reinforcement Learning

Cited: 0
|
Authors
Veeriah, Vivek [1 ]
van Seijen, Harm [1 ,2 ]
Sutton, Richard S. [1 ]
Affiliations
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[2] Univ Alberta, Edmonton, AB, Canada
Source
AAMAS'17: PROCEEDINGS OF THE 16TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS | 2017
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
Reinforcement Learning; Actor-Critic; Policy Gradient; Nonlinear Function Approximation; Incremental Learning;
DOI
Not available
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline Code
0812;
Abstract
Multi-step methods are important in reinforcement learning (RL). Eligibility traces, the usual way of handling them, work well with linear function approximators. Recently, van Seijen (2016) introduced a delayed-learning approach, without eligibility traces, for handling the multi-step lambda-return with nonlinear function approximators. However, this approach was limited to action-value methods. In this paper, we extend it to handle n-step returns, generalize it to policy gradient methods, and empirically study the effect of such delayed updates on control tasks. Specifically, we introduce two novel forward actor-critic methods and empirically compare them with the conventional actor-critic method on mountain-car and pole-balancing tasks. In our experiments, the forward actor-critic dramatically outperforms the conventional actor-critic on these standard control tasks. Notably, the forward actor-critic method gives rise to a new class of multi-step RL algorithms without eligibility traces.
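The core idea summarized above, replacing eligibility traces with updates that are simply delayed until the multi-step return has actually been observed, can be sketched briefly. The Python snippet below is an illustrative sketch only, not the paper's algorithm: it assumes a hypothetical environment whose step method returns (next state, reward, done), and user-supplied, possibly nonlinear, policy and value approximators together with their gradients (policy, value_fn, grad_log_policy, grad_value, all hypothetical names). The actor and critic updates for time step t are applied only once the n-step return for t is known.

```python
# Illustrative sketch (not the authors' exact algorithm): a forward-view
# n-step actor-critic in which the update for time step t is delayed until
# the n-step return G_t^(n) can be computed from observed rewards,
# avoiding eligibility traces. Parameters theta and w are assumed to be
# numpy-style arrays; the approximators and env are hypothetical.
from collections import deque

def forward_actor_critic_episode(env, policy, value_fn,
                                 grad_log_policy, grad_value,
                                 theta, w, n=4, gamma=0.99,
                                 alpha_actor=1e-3, alpha_critic=1e-2):
    """Run one episode, applying each actor/critic update n steps late."""
    buffer = deque()                      # pending (state, action, reward) tuples
    s = env.reset()
    done = False
    while not done or buffer:
        if not done:
            a = policy(s, theta)          # sample an action from the current policy
            s_next, r, done = env.step(a)
            buffer.append((s, a, r))
            s = s_next
        # Once n transitions are buffered (or the episode has ended), the
        # oldest transition's n-step return is fully determined: update now.
        if len(buffer) == n or (done and buffer):
            s_t, a_t, _ = buffer[0]
            G = sum(gamma**k * r_k for k, (_, _, r_k) in enumerate(buffer))
            if not done:                  # bootstrap from the critic at s_{t+n}
                G += gamma**len(buffer) * value_fn(s, w)
            delta = G - value_fn(s_t, w)  # advantage-style TD error
            w = w + alpha_critic * delta * grad_value(s_t, w)
            theta = theta + alpha_actor * delta * grad_log_policy(s_t, a_t, theta)
            buffer.popleft()
    return theta, w
```

The design point the sketch is meant to convey is that no trace vectors are maintained; instead, each update is postponed by at most n steps, which is what makes the scheme compatible with nonlinear function approximators.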
Pages: 556-564
Page count: 9