Multiscale Q-learning with linear function approximation

Authors
Shalabh Bhatnagar
K. Lakshmanan
Affiliations
[1] Indian Institute of Science,Department of Computer Science and Automation
[2] National University of Singapore,Department of Mechanical Engineering
Keywords
Q-learning with linear function approximation; Reinforcement learning; Stochastic approximation; Ordinary differential equation; Differential inclusion; Multi-stage stochastic shortest path problem
DOI
Not available
Abstract
We present a two-timescale variant of Q-learning with linear function approximation. Both the Q-values and the policies are assumed to be parameterized, with the policy parameter updated on a faster timescale than the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed, connected, internally chain transitive invariant set of an associated differential inclusion.
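
The timescale separation described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which this record does not reproduce. It is a generic two-timescale scheme under stated assumptions: a hypothetical two-state, two-action toy environment (toy_env), one-hot features, a softmax policy, and step sizes a(n) = 1/(n+1) for the Q-value parameter and b(n) = 1/(n+1)^0.6 for the policy parameter, chosen so that a(n)/b(n) tends to 0 and the policy parameter moves on the faster timescale.

```python
import math
import random

N_STATES, N_ACTIONS, GAMMA = 2, 2, 0.9  # hypothetical toy problem sizes

def features(s, a):
    # Hypothetical one-hot features over (state, action) pairs.
    phi = [0.0] * (N_STATES * N_ACTIONS)
    phi[s * N_ACTIONS + a] = 1.0
    return phi

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def policy(w, s):
    # Softmax policy over linear preferences w . phi(s, a) (an assumption).
    prefs = [dot(w, features(s, a)) for a in range(N_ACTIONS)]
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def slow_step(n):
    # Slower timescale a(n), used for the Q-value parameter theta.
    return 1.0 / (n + 1)

def fast_step(n):
    # Faster timescale b(n), used for the policy parameter w; a(n)/b(n) -> 0.
    return 1.0 / (n + 1) ** 0.6

def toy_env(s, a, rng):
    # Hypothetical dynamics: action 1 pays reward 1, next state is uniform.
    reward = 1.0 if a == 1 else 0.0
    return reward, rng.randrange(N_STATES)

def run(num_iters=2000, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * (N_STATES * N_ACTIONS)  # Q-value parameter (slow)
    w = [0.0] * (N_STATES * N_ACTIONS)      # policy parameter (fast)
    s = 0
    for n in range(num_iters):
        probs = policy(w, s)
        a = rng.choices(range(N_ACTIONS), weights=probs)[0]
        r, s_next = toy_env(s, a, rng)
        phi = features(s, a)
        q_sa = dot(theta, phi)  # linear approximation Q(s, a) = theta . phi(s, a)

        # Fast timescale: move w along grad log pi(a|s), weighted by Q minus its policy average.
        mean_q = sum(p * dot(theta, features(s, b)) for b, p in enumerate(probs))
        mean_phi = [sum(p * features(s, b)[i] for b, p in enumerate(probs))
                    for i in range(len(w))]
        for i in range(len(w)):
            w[i] += fast_step(n) * (q_sa - mean_q) * (phi[i] - mean_phi[i])

        # Slow timescale: temporal-difference update of theta under the current policy.
        next_probs = policy(w, s_next)
        target = r + GAMMA * sum(p * dot(theta, features(s_next, b))
                                 for b, p in enumerate(next_probs))
        for i in range(len(theta)):
            theta[i] += slow_step(n) * (target - q_sa) * phi[i]
        s = s_next
    return theta, w
```

Calling run() returns the pair (theta, w). The exponent 0.6 is only one admissible choice: any pair of schedules that are non-summable, square-summable, and satisfy a(n)/b(n) tending to 0 gives the same kind of separation.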
Pages: 477-509
Number of pages: 32
Related papers
50 in total
  • [1] Multiscale Q-learning with linear function approximation
    Bhatnagar, Shalabh
    Lakshmanan, K.
    DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2016, 26 (03): : 477 - 509
  • [2] Q-Learning with linear function approximation
    Melo, Francisco S.
    Ribeiro, M. Isabel
    LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 308 - +
  • [3] LinFa-Q: Accurate Q-learning with linear function approximation
    Wang, Zhechao
    Fu, Qiming
    Chen, Jianping
    Liu, Quan
    Lu, You
    Wu, Hongjie
    Hu, Fuyuan
    NEUROCOMPUTING, 2025, 611
  • [4] Zap Q-learning with Nonlinear Function Approximation
    Chen, Shuhang
    Devraj, Adithya M.
    Lu, Fan
    Busic, Ana
    Meyn, Sean P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [5] Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation
    Cisneros-Velarde, Pedro
    Koyejo, Sanmi
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 424 - 432
  • [6] The impact of data distribution on Q-learning with function approximation
    Santos, Pedro P.
    Carvalho, Diogo S.
    Sardinha, Alberto
    Melo, Francisco S.
    MACHINE LEARNING, 2024, 113 (09) : 6141 - 6163
  • [7] Whittle Index-Based Q-Learning for Wireless Edge Caching With Linear Function Approximation
    Xiong, Guojun
    Wang, Shufan
    Li, Jian
    Singh, Rahul
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (05) : 4286 - 4301
  • [8] Virtual Machine Placement Via Q-Learning with Function Approximation
    Duong, Thai
    Chu, Yu-Jung
    Thinh Nguyen
    Chakareski, Jacob
    2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015,
  • [9] Linear Approximation based Q-Learning for Edge Caching in Massive MIMO Networks
    Garg, Navneet
    Sellathurai, Mathini
    Ratnarajah, Tharmalingam
    CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1769 - 1773
  • [10] Gaussian approximation for bias reduction in Q-learning
    D’Eramo, Carlo
    Cini, Andrea
    Nuara, Alessandro
    Pirotta, Matteo
    Alippi, Cesare
    Peters, Jan
    Restelli, Marcello
    Journal of Machine Learning Research, 2021, 22