Multiscale Q-learning with linear function approximation

Authors
Shalabh Bhatnagar
K. Lakshmanan
Affiliations
[1] Indian Institute of Science, Department of Computer Science and Automation
[2] National University of Singapore, Department of Mechanical Engineering
Source
Discrete Event Dynamic Systems | 2016 / Vol. 26
Keywords
Q-learning with linear function approximation; Reinforcement learning; Stochastic approximation; Ordinary differential equation; Differential inclusion; Multi-stage stochastic shortest path problem
DOI
Not available
Abstract
We present a two-timescale variant of Q-learning with linear function approximation. Both the Q-values and the policies are assumed to be parameterized, with the policy parameter updated on a faster timescale than the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed, connected, internally chain transitive invariant set of an associated differential inclusion.
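The timescale separation mentioned in the abstract can be illustrated with a minimal sketch of step-size schedules of the kind commonly used in two-timescale stochastic approximation. The function names `slow_step` and `fast_step` and the exponents are assumptions chosen for illustration; they are not taken from the article, and this is not the authors' algorithm.

```python
def slow_step(n: int) -> float:
    """Hypothetical step size a(n) for the Q-value parameter (slower timescale)."""
    return 1.0 / (n + 1)


def fast_step(n: int) -> float:
    """Hypothetical step size b(n) for the policy parameter (faster timescale)."""
    return 1.0 / (n + 1) ** 0.6


# Timescale separation: the ratio a(n) / b(n) = (n + 1) ** -0.4 shrinks toward 0,
# so the slow iterate looks nearly frozen from the fast iterate's point of view.
ratios = [slow_step(n) / fast_step(n) for n in (1, 100, 10_000)]
```

Both schedules here have divergent sums and summable squares, which are the usual conditions placed on such step sizes; the article may use different schedules.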
Pages: 477-509
Page count: 32