Bayesian Reinforcement Learning with Exploration

Cited by: 0

Authors:

Lattimore, Tor [1]

Hutter, Marcus [2]

Affiliations:

[1] Univ Alberta, Edmonton, AB T6G 2M7, Canada

[2] Australian Natl Univ, Canberra, ACT 0200, Australia

Source:

ALGORITHMIC LEARNING THEORY (ALT 2014) | 2014 / Vol. 8776

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.

Pages: 170-184

Page count: 15
