Multiscale Q-learning with linear function approximation

被引：0

作者：

Shalabh Bhatnagar

K. Lakshmanan

机构：

[1] Indian Institute of Science,Department of Computer Science and Automation

[2] National University of Singapore,Department of Mechanical Engineering

来源：

Discrete Event Dynamic Systems | 2016年 / 26卷

关键词：

Q-learning with linear function approximation; Reinforcement learning; Stochastic approximation; Ordinary differential equation; Differential inclusion; Multi-stage Stochastic shortest path problem;

D O I：

暂无

中图分类号：

学科分类号：

摘要：

We present in this article a two-timescale variant of Q-learning with linear function approximation. Both Q-values and policies are assumed to be parameterized with the policy parameter updated on a faster timescale as compared to the Q-value parameter. This timescale separation is seen to result in significantly improved numerical performance of the proposed algorithm over Q-learning. We show that the proposed algorithm converges almost surely to a closed connected internally chain transitive invariant set of an associated differential inclusion.

引用

页码：477 / 509

页数：32

共 50 条

[41] Neural Q-learning
Stephan ten Hagen
Ben Kröse
Neural Computing & Applications, 2003, 12 : 81 - 88
[42] Robust Q-Learning
Ertefaie, Ashkan
McKay, James R.
Oslin, David
Strawderman, Robert L.
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (533) : 368 - 381
[43] On Synchronous Binary Log-Linear Learning and Second Order Q-learning
Hasanbeig, Mohammadhosein
Pavel, Lacra
IFAC PAPERSONLINE, 2017, 50 (01): : 8987 - 8992
[44] Neural Q-learning
ten Hagen, S
Kröse, B
NEURAL COMPUTING & APPLICATIONS, 2003, 12 (02): : 81 - 88
[45] Logistic Q-Learning
Bas-Serrano, Joan
Curi, Sebastian
Krause, Andreas
Neu, Gergely
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
[46] Enhancing Nash Q-learning and Team Q-learning mechanisms by using bottlenecks
Ghazanfari, Behzad
Mozayani, Nasser
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2014, 26 (06) : 2771 - 2783
[47] Comparison of Deep Q-Learning, Q-Learning and SARSA Reinforced Learning for Robot Local Navigation
Anas, Hafiq
Ong, Wee Hong
Malik, Owais Ahmed
ROBOT INTELLIGENCE TECHNOLOGY AND APPLICATIONS 6, 2022, 429 : 443 - 454
[48] Learning to Play Pac-Xon with Q-Learning and Two Double Q-Learning Variants
Schilperoort, Jits
Mak, Ivar
Drugan, Madalina M.
Wiering, Marco A.
2018 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI), 2018, : 1151 - 1158
[49] Efficient non-linear control by combining Q-learning with local linear controllers
Kimura, H
Kobayashi, S
MACHINE LEARNING, PROCEEDINGS, 1999, : 210 - 219
[50] Improved Q-Learning Method for Linear Discrete-Time Systems
Chen, Jian
Wang, Jinhua
Huang, Jie
PROCESSES, 2020, 8 (03)

← 1 2 3 4 5 →