Leveraging Observational Learning for Exploration in Bandits Extended Abstract

Cited by: 0
Authors
Lupu, Andrei [1]
Durand, Audrey [1]
Precup, Doina [1]
Affiliations
[1] McGill Univ, Montreal, PQ, Canada
Keywords
Observational learning; imitation learning; bandits
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Pages: 2001-2003
Page count: 3
Related papers
50 in total
  • [1] Bandits with Knapsacks (Extended Abstract)
    Badanidiyuru, Ashwinkumar
    Kleinberg, Robert
    Slivkins, Aleksandrs
    [J]. 2013 IEEE 54TH ANNUAL SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE (FOCS), 2013, : 207 - 216
  • [2] Models for Autonomously Motivated Exploration in Reinforcement Learning (Extended Abstract)
    Auer, Peter
    Lim, Shiau Hong
    Watkins, Chris
    [J]. ALGORITHMIC LEARNING THEORY, 2011, 6925 : 14 - +
  • [3] Guiding Reinforcement Learning Exploration Using Natural Language Extended Abstract
    Harrison, Brent
    Ehsan, Upol
    Riedl, Mark O.
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), 2018, : 1956 - 1958
  • [4] Improved Learning Complexity in Combinatorial Pure Exploration Bandits
    Gabillon, Victor
    Lazaric, Alessandro
    Ghavamzadeh, Mohammad
    Ortner, Ronald
    Bartlett, Peter
    [J]. ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 51, 2016, 51 : 1004 - 1012
  • [5] Leveraging Currency for Repairing Inconsistent and Incomplete Data (Extended Abstract)
    Ding, Xiaoou
    Wang, Hongzhi
    Su, Jiaxuan
    Wang, Muxian
    Li, Jianzhong
    Gao, Hong
    [J]. 2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2315 - 2316
  • [6] On the Observational Theory of the CPS-calculus (Extended Abstract)
    Merro, Massimo
    Biasi, Corrado
    [J]. ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2006, 158 : 307 - 330
  • [7] Learning from Failure [Extended Abstract]
    Grollman, Daniel H.
    Billard, Aude G.
    [J]. PROCEEDINGS OF THE 6TH ACM/IEEE INTERNATIONAL CONFERENCE ON HUMAN-ROBOT INTERACTIONS (HRI 2011), 2011, : 145 - 146
  • [8] Meta-Learning Effective Exploration Strategies for Contextual Bandits
    Sharaf, Amr
    Daume, Hal, III
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 9541 - 9548
  • [9] Contextual Bandits with Delayed Feedback and Semi-supervised Learning (Student Abstract)
    Yang, Luting
    Yang, Jianyi
    Ren, Shaolei
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 15943 - 15944
  • [10] Deep Residual Reinforcement Learning (Extended Abstract)
    Zhang, Shangtong
    Boehmer, Wendelin
    Whiteson, Shimon
    [J]. PROCEEDINGS OF THE THIRTIETH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2021, 2021, : 4869 - 4873