Improving Optimistic Exploration in Model-Free Reinforcement Learning

Cited by: 0
Authors
Grzes, Marek [1 ]
Kudenko, Daniel [1 ]
Affiliations
[1] Univ York, Dept Comp Sci, York YO10 5DD, N Yorkshire, England
Source
Keywords
DOI
Not available
CLC number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The key problem in reinforcement learning is the exploration-exploitation tradeoff. Optimistic initialisation of the value function is a popular RL strategy. The problem with this approach is that the algorithm may have relatively low performance after many episodes of learning. In this paper, two extensions to standard optimistic exploration are proposed. The first is based on a different initialisation of the value function of goal states. The second, which builds on the first, explicitly separates the propagation of low and high values in the state space. The proposed extensions show improvements in empirical comparisons with basic optimistic initialisation. Additionally, they improve anytime performance and help on domains where learning takes place in a sub-space of a large state space, that is, where the standard optimistic approach faces more difficulties.
Pages: 360-369
Page count: 10
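To make the baseline in the abstract concrete, below is a minimal sketch of standard optimistic initialisation in tabular Q-learning: every state-action value starts at an upper bound on the discounted return (r_max / (1 - gamma)), so any untried action dominates greedy selection and optimism alone drives exploration. The environment, constants, and names here are illustrative assumptions, not taken from the paper.

```python
# --- Illustrative sketch; all names and constants are assumptions, ---
# --- not taken from the paper. ---

N_STATES = 10          # toy chain MDP: states 0..9, state 9 is the goal
ACTIONS = (0, 1)       # 0 = left, 1 = right
GAMMA = 0.95           # discount factor
ALPHA = 0.1            # learning rate
R_MAX = 1.0            # maximum one-step reward

def step(state, action):
    """Deterministic chain: moving right from state 8 reaches the goal."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    done = nxt == N_STATES - 1
    reward = R_MAX if done else 0.0
    return nxt, reward, done

# Standard optimistic initialisation: start every Q-value at an upper bound
# on the discounted return, so actions not yet tried look attractive.
q_upper = R_MAX / (1.0 - GAMMA)
Q = {(s, a): q_upper for s in range(N_STATES) for a in ACTIONS}

for episode in range(200):
    state, done = 0, False
    while not done:
        # Purely greedy selection: optimism alone drives exploration.
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
        nxt, reward, done = step(state, action)
        target = reward if done else reward + GAMMA * max(Q[(nxt, a)] for a in ACTIONS)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        state = nxt
```

Note that this sketch initialises goal states as optimistically as everything else, which is exactly the baseline behaviour the abstract says the paper improves on; the paper's two extensions (different goal-state initialisation, and separate propagation of low and high values) are not reproduced here.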