Near-optimal regret bounds for reinforcement learning

Authors
Jaksch, Thomas [1]
Ortner, Ronald [1]
Auer, Peter [1]
Affiliation
[1] Department of Information Technology, University of Leoben, Franz-Josef-Strasse 18, Leoben 8700, Austria
DOI: not available
Pages: 1563-1600
Related papers (50 in total)
  • [1] Near-optimal Regret Bounds for Reinforcement Learning
    Jaksch, Thomas
    Ortner, Ronald
    Auer, Peter
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2010, 11 : 1563 - 1600
  • [2] Near-Optimal Regret Bounds for Multi-batch Reinforcement Learning
    Zhang, Zihan
    Jiang, Yuhang
    Zhou, Yuan
    Ji, Xiangyang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [3] Near-Optimal Regret Bounds for Thompson Sampling
    Agrawal, Shipra
    Goyal, Navin
    [J]. JOURNAL OF THE ACM, 2017, 64 (05)
  • [4] Model-Free Nonstationary Reinforcement Learning: Near-Optimal Regret and Applications in Multiagent Reinforcement Learning and Inventory Control
    Mao, Weichao
    Zhang, Kaiqing
    Zhu, Ruihao
    Simchi-Levi, David
    Basar, Tamer
    [J]. MANAGEMENT SCIENCE, 2024,
  • [5] Near-Optimal No-Regret Learning in General Games
    Daskalakis, Constantinos
    Fishelson, Maxwell
    Golowich, Noah
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [6] Kernelized Reinforcement Learning with Order Optimal Regret Bounds
    Vakili, Sattar
    Olkhovskaya, Julia
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [7] Collaborative Linear Bandits with Adversarial Agents: Near-Optimal Regret Bounds
    Mitra, Aritra
    Adibi, Arman
    Pappas, George J.
    Hassani, Hamed
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [8] Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
    Fei, Yingjie
    Yang, Zhuoran
    Chen, Yudong
    Wang, Zhaoran
    Xie, Qiaomin
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [9] Near-Optimal Reinforcement Learning in Polynomial Time
    Kearns, Michael
    Singh, Satinder
    [J]. Machine Learning, 2002, 49 : 209 - 232
  • [10] Near-optimal reinforcement learning in polynomial time
    Kearns, M
    Singh, S
    [J]. MACHINE LEARNING, 2002, 49 (2-3) : 209 - 232