Kernel-Based Reinforcement Learning

Cited by: 1
Authors
Dirk Ormoneit
Śaunak Sen
Affiliations
[1] Stanford University, Department of Computer Science
[2] The Jackson Laboratory
Source
Machine Learning | 2002 / Vol. 49
Keywords
reinforcement learning; Markov decision process; kernel-based learning; kernel smoothing; local averaging; lazy learning
DOI
Not available
Abstract
We present a kernel-based approach to reinforcement learning that overcomes the stability problems of temporal-difference learning in continuous state-spaces. First, our algorithm converges to a unique solution of an approximate Bellman's equation regardless of its initialization values. Second, the method is consistent in the sense that the resulting policy converges asymptotically to the optimal policy. Parametric value function estimates such as neural networks do not possess this property. Our kernel-based approach also allows us to show that the limiting distribution of the value function estimate is a Gaussian process. This information is useful in studying the bias-variance tradeoff in reinforcement learning. We find that all reinforcement learning approaches to estimating the value function, parametric or non-parametric, are subject to a bias. This bias is typically larger in reinforcement learning than in a comparable regression problem.
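The first claim in the abstract, convergence to a unique solution regardless of initialization, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a discounted formulation, a Gaussian kernel with row-normalized weights, and a toy one-dimensional problem. The function names, the bandwidth, and the synthetic data are all hypothetical choices made for illustration. Because the normalized weights are non-negative and sum to one, the smoothed Bellman operator is a contraction in the sup norm, so two very different starting points should reach the same fixed point.

```python
import numpy as np


def gaussian_weights(centers, queries, bandwidth):
    """Row-normalized Gaussian kernel weights (rows: queries, columns: centers)."""
    d2 = (queries[:, None] - centers[None, :]) ** 2
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w / w.sum(axis=1, keepdims=True)


def kernel_value_iteration(x, a, r, y, n_actions, gamma=0.9, bandwidth=0.2,
                           v0=None, tol=1e-10, max_iter=10_000):
    """Iterate a kernel-smoothed Bellman operator on the successor states y.

    Each sample s is a transition (x[s], a[s], r[s], y[s]). The value
    function is only tracked at the successor states, so v[i] ~ V(y[i]).
    """
    masks = [a == act for act in range(n_actions)]
    # weights[k][i, j]: weight of the j-th sample with action k when evaluating at y[i]
    weights = [gaussian_weights(x[m], y, bandwidth) for m in masks]
    v = np.zeros(len(y)) if v0 is None else np.asarray(v0, dtype=float).copy()
    for _ in range(max_iter):
        # Local average of (reward + discounted next value), one row per action
        q = np.stack([w @ (r[m] + gamma * v[m]) for w, m in zip(weights, masks)])
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


# Hypothetical toy data: states in [0, 1], action 0 moves left, action 1 moves right,
# and the reward penalizes distance of the successor state from 0.5.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, n)
a = rng.integers(0, 2, n)
y = np.clip(x + np.where(a == 1, 0.1, -0.1) + rng.normal(0.0, 0.02, n), 0.0, 1.0)
r = -np.abs(y - 0.5)

v_a = kernel_value_iteration(x, a, r, y, n_actions=2)
v_b = kernel_value_iteration(x, a, r, y, n_actions=2, v0=rng.normal(0.0, 100.0, n))
gap = np.max(np.abs(v_a - v_b))
print("largest gap between the two solutions:", gap)
```

The sketch only demonstrates the fixed-point property on a finite sample. The consistency and Gaussian-process results mentioned in the abstract concern the behavior as the sample grows and are not shown here.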
Pages: 161 - 178
Page count: 17
Related papers
  • [1] Kernel-Based Reinforcement Learning
    Hu, Guanghua
    Qiu, Yuqin
    Xiang, Liming
    [J]. INTELLIGENT COMPUTING, PART I: INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING, ICIC 2006, PART I, 2006, 4113 : 757 - 766
  • [2] Kernel-based reinforcement learning
    Ormoneit, D
    Sen, S
    [J]. MACHINE LEARNING, 2002, 49 (2-3) : 161 - 178
  • [3] Practical Kernel-Based Reinforcement Learning
    Barreto, Andre M. S.
    Precup, Doina
    Pineau, Joelle
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2016, 17
  • [4] KERNEL-BASED LIFELONG POLICY GRADIENT REINFORCEMENT LEARNING
    Mowakeaa, Rami
    Kim, Seung-Jun
    Emge, Darren K.
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3500 - 3504
  • [5] Kernel-Based Decentralized Policy Evaluation for Reinforcement Learning
    Liu, Jiamin
    Lian, Heng
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [6] Kernel-Based Reinforcement Learning: A Finite-Time Analysis
    Domingues, Omar D.
    Menard, Pierre
    Pirotta, Matteo
    Kaufmann, Emilie
    Valko, Michal
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [7] Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
    Lim, Shiau Hong
    Autef, Arnaud
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [8] Kernel-based least squares policy iteration for reinforcement learning
    Xu, Xin
    Hu, Dewen
    Lu, Xicheng
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 2007, 18 (04) : 973 - 992
  • [9] Kernel-based reinforcement learning in average-cost problems
    Ormoneit, D
    Glynn, P
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2002, 47 (10) : 1624 - 1636
  • [10] The Characteristics of Kernel and Kernel-based Learning
    Tan, Fuxiao
    Han, Dezhi
    [J]. 2019 3RD INTERNATIONAL SYMPOSIUM ON AUTONOMOUS SYSTEMS (ISAS 2019), 2019, : 406 - 411