Multiscale Q-learning with linear function approximation

Cited by: 0

Authors:

Shalabh Bhatnagar

K. Lakshmanan

Affiliations:

[1] Indian Institute of Science, Department of Computer Science and Automation

[2] National University of Singapore, Department of Mechanical Engineering

Source:

Discrete Event Dynamic Systems | 2016, Vol. 26

Keywords:

Q-learning with linear function approximation; Reinforcement learning; Stochastic approximation; Ordinary differential equation; Differential inclusion; Multi-stage stochastic shortest path problem

DOI:

Not available


Abstract:

We present in this article a two-timescale variant of Q-learning with linear function approximation. Both the Q-values and the policies are assumed to be parameterized, with the policy parameter updated on a faster timescale than the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed connected internally chain transitive invariant set of an associated differential inclusion.
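
The two-timescale idea in the abstract can be illustrated with a minimal, generic sketch. This is not the authors' algorithm, whose exact update rules are not given in this record. Everything here is an assumption made for illustration: the toy two-state MDP, the one-hot features (a special case of linear function approximation), the softmax policy, the discount factor, and the step-size schedules `slow_step` and `fast_step`. The only property it aims to show is that the policy parameter `w` uses a step size that decays more slowly than the one used for the Q-value parameter `theta`, so the ratio of slow to fast step sizes tends to zero.

```python
import math
import random

N_STATES, N_ACTIONS = 2, 2  # hypothetical toy problem, not from the paper


def phi(s, a):
    """One-hot feature vector for (s, a); an assumed choice of linear features."""
    vec = [0.0] * (N_STATES * N_ACTIONS)
    vec[s * N_ACTIONS + a] = 1.0
    return vec


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def slow_step(n):
    """Assumed step size for the Q-value parameter (slower timescale)."""
    return 1.0 / (n + 1)


def fast_step(n):
    """Assumed step size for the policy parameter (faster timescale)."""
    return 1.0 / (n + 1) ** 0.6


def policy(w, s):
    """Softmax policy with preferences linear in the policy parameter w."""
    return softmax([dot(w, phi(s, a)) for a in range(N_ACTIONS)])


def run(num_steps=5000, gamma=0.9, seed=0):
    rng = random.Random(seed)
    theta = [0.0] * (N_STATES * N_ACTIONS)  # Q(s, a) ~ theta . phi(s, a)
    w = [0.0] * (N_STATES * N_ACTIONS)      # policy parameter
    s = 0
    for n in range(num_steps):
        pi = policy(w, s)
        a = rng.choices([0, 1], weights=pi)[0]
        # Assumed dynamics: action 1 pays reward 1, next state is uniform.
        r = 1.0 if a == 1 else 0.0
        s_next = rng.randrange(N_STATES)
        pi_next = policy(w, s_next)
        v_next = sum(p * dot(theta, phi(s_next, b))
                     for b, p in enumerate(pi_next))
        # Slow timescale: temporal-difference update of the Q-value parameter.
        td = r + gamma * v_next - dot(theta, phi(s, a))
        f = phi(s, a)
        theta = [t + slow_step(n) * td * fi for t, fi in zip(theta, f)]
        # Fast timescale: gradient ascent on sum_a pi(a|s) Q(s, a) in w.
        q_s = [dot(theta, phi(s, b)) for b in range(N_ACTIONS)]
        v_s = sum(p * q for p, q in zip(pi, q_s))
        for b in range(N_ACTIONS):
            w[s * N_ACTIONS + b] += fast_step(n) * pi[b] * (q_s[b] - v_s)
        s = s_next
    return theta, w
```

Calling `run()` returns the two parameter vectors. The exponents 1 and 0.6 are only one possible pair of schedules for which `slow_step(n) / fast_step(n)` goes to zero as `n` grows.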


Pages: 477-509

Page count: 32

50 records in total

[1] Multiscale Q-learning with linear function approximation
Bhatnagar, Shalabh
Lakshmanan, K.
DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2016, 26 (03) : 477 - 509
[2] Q-Learning with linear function approximation
Melo, Francisco S.
Ribeiro, M. Isabel
LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 308 - +
[3] LinFa-Q: Accurate Q-learning with linear function approximation
Wang, Zhechao
Fu, Qiming
Chen, Jianping
Liu, Quan
Lu, You
Wu, Hongjie
Hu, Fuyuan
NEUROCOMPUTING, 2025, 611
[4] Zap Q-learning with Nonlinear Function Approximation
Chen, Shuhang
Devraj, Adithya M.
Lu, Fan
Busic, Ana
Meyn, Sean P.
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
[5] Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation
Cisneros-Velarde, Pedro
Koyejo, Sanmi
UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 424 - 432
[6] The impact of data distribution on Q-learning with function approximation
Santos, Pedro P.
Carvalho, Diogo S.
Sardinha, Alberto
Melo, Francisco S.
MACHINE LEARNING, 2024, 113 (09) : 6141 - 6163
[7] Whittle Index-Based Q-Learning for Wireless Edge Caching With Linear Function Approximation
Xiong, Guojun
Wang, Shufan
Li, Jian
Singh, Rahul
IEEE-ACM TRANSACTIONS ON NETWORKING, 2024, 32 (05) : 4286 - 4301
[8] Virtual Machine Placement Via Q-Learning with Function Approximation
Duong, Thai
Chu, Yu-Jung
Thinh Nguyen
Chakareski, Jacob
2015 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2015
[9] Linear Approximation based Q-Learning for Edge Caching in Massive MIMO Networks
Garg, Navneet
Sellathurai, Mathini
Ratnarajah, Tharmalingam
CONFERENCE RECORD OF THE 2019 FIFTY-THIRD ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, 2019, : 1769 - 1773
[10] Gaussian approximation for bias reduction in Q-learning
D’Eramo, Carlo
Cini, Andrea
Nuara, Alessandro
Pirotta, Matteo
Alippi, Cesare
Peters, Jan
Restelli, Marcello
Journal of Machine Learning Research, 2021, 22
