Kernel-Based Reinforcement Learning in Robust Markov Decision Processes

Cited by: 0
Authors
Lim, Shiau Hong [1 ]
Autef, Arnaud [2 ]
Affiliations
[1] IBM Res, Singapore, Singapore
[2] Ecole Polytech, Appl Math Dept, Paris, France
Keywords
DOI
Not available
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
The robust Markov Decision Process (MDP) framework addresses parameter uncertainty arising from model mismatch, approximation errors, or even adversarial behavior. It is especially relevant when deploying learned policies in real-world applications. Scaling the robust MDP framework to large or continuous state spaces remains challenging: function approximation is usually unavoidable in this setting, and it can only amplify model mismatch and parameter uncertainty. It has previously been shown that, for MDPs with state aggregation, robust policies enjoy a tighter performance bound than standard solutions because of their reduced sensitivity to approximation errors. We extend these results to the much larger class of kernel-based approximators and show, both analytically and empirically, that robust policies can significantly outperform their non-robust counterparts.
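To make the setting of the abstract concrete, the sketch below combines kernel-based averaging (in the spirit of kernel-based reinforcement learning) with a robust Bellman backup over a rectangular L1 uncertainty set centered on the kernel-smoothed nominal transition weights. This is an illustrative assumption, not the authors' algorithm or their uncertainty set; the function names (`robust_kernel_backup`, `l1_worst_case`), the Gaussian kernel, and the radius `eps` are hypothetical choices for the sketch.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.5):
    """Gaussian similarity between two state vectors (assumed kernel choice)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def l1_worst_case(p_nom, v, eps):
    """Worst-case expectation of v over an L1 ball of radius eps around p_nom."""
    order = np.argsort(v)                    # outcome indices, ascending by value
    p = p_nom.copy()
    worst = order[0]
    budget = min(eps / 2.0, 1.0 - p[worst])  # mass we may shift onto the worst outcome
    p[worst] += budget
    for i in order[::-1]:                    # take that mass from the best outcomes first
        if i == worst:
            continue
        take = min(budget, p[i])
        p[i] -= take
        budget -= take
        if budget <= 0.0:
            break
    return float(np.dot(p, v))

def robust_kernel_backup(query_state, transitions, V, gamma=0.95, eps=0.2):
    """
    One robust Bellman backup for a fixed action at `query_state`.

    `transitions` is a list of sampled (state, reward, next_state_index) tuples
    for that action; `V` maps next-state indices to current value estimates.
    The nominal next-state distribution is the kernel-weighted empirical
    distribution over the sampled transitions; the robust value takes the
    worst case over an L1 ball around it.
    """
    states, rewards, next_idx = zip(*transitions)
    w = np.array([gaussian_kernel(query_state, s) for s in states])
    w = w / w.sum()                          # nominal kernel weights (a distribution)
    r_hat = float(np.dot(w, rewards))        # kernel-smoothed immediate reward
    v_next = np.array([V[i] for i in next_idx], dtype=float)
    return r_hat + gamma * l1_worst_case(w, v_next, eps)

# Toy usage: three sampled transitions for one action, values for two next states.
samples = [(np.array([0.1]), 1.0, 0), (np.array([0.2]), 0.5, 1), (np.array([0.9]), 0.0, 1)]
V = {0: 2.0, 1: -1.0}
print(robust_kernel_backup(np.array([0.15]), samples, V))
```

Setting `eps=0.0` recovers the standard (non-robust) kernel-based backup, which is a convenient way to compare robust and non-robust value estimates on the same samples.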
Pages: 9
Related Papers
50 records in total
  • [21] On the convergence of projective-simulation-based reinforcement learning in Markov decision processes
    Boyajian, W. L.
    Clausen, J.
    Trenkwalder, L. M.
    Dunjko, V.
    Briegel, H. J.
    [J]. QUANTUM MACHINE INTELLIGENCE, 2020, 2 (02)
  • [23] Robust kernel-based learning for image-related problems
    Liao, C-T.
    Lai, S-H.
    [J]. IET IMAGE PROCESSING, 2012, 6 (06) : 795 - 803
  • [24] Learning Kernel-Based Robust Disturbance Dictionary for Face Recognition
    Ding, Biwei
    Ji, Hua
    [J]. APPLIED SCIENCES-BASEL, 2019, 9 (06)
  • [25] REINFORCEMENT LEARNING OF NON-MARKOV DECISION-PROCESSES
    WHITEHEAD, SD
    LIN, LJ
    [J]. ARTIFICIAL INTELLIGENCE, 1995, 73 (1-2) : 271 - 306
  • [26] From Perturbation Analysis to Markov Decision Processes and Reinforcement Learning
    Xi-Ren Cao
    [J]. Discrete Event Dynamic Systems, 2003, 13 : 9 - 39
  • [27] Reinforcement Learning for Cost-Aware Markov Decision Processes
    Suttle, Wesley A.
    Zhang, Kaiqing
    Yang, Zhuoran
    Kraemer, David N.
    Liu, Ji
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [29] Kernel-Based Markov Random Fields Learning for Wireless Sensor Networks
    Zhao, Wei
    Liang, Yao
    [J]. 2011 IEEE 36TH CONFERENCE ON LOCAL COMPUTER NETWORKS (LCN), 2011, : 155 - 158
  • [30] Kernel-Based Copula Processes
    Jaimungal, Sebastian
    Ng, Eddie K. H.
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT I, 2009, 5781 : 628+