Robust Reinforcement Learning with Bayesian Optimisation and Quadrature

Cited by: 0
Authors
Paul, Supratik [1 ]
Chatzilygeroudis, Konstantinos [2 ,3 ,4 ]
Ciosek, Kamil [1 ]
Mouret, Jean-Baptiste [2 ,3 ,4 ]
Osborne, Michael A. [5 ]
Whiteson, Shimon [1 ]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Wolfson Bldg,Parks Rd, Oxford OX1 3QD, England
[2] INRIA, Paris, France
[3] Univ Lorraine, Nancy, France
[4] CNRS, Paris, France
[5] Univ Oxford, Dept Engn Sci, Oxford, England
Funding
European Research Council
Keywords
Reinforcement Learning; Bayesian Optimisation; Bayesian Quadrature; Significant rare events; Environment variables;
DOI
Not available
Chinese Library Classification
TP [Automation & Computer Technology]
Discipline Code
0812
Abstract
Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are unobservable and randomly determined by the environment in a physical setting but are controllable in a simulator. This article considers the problem of finding a robust policy while taking into account the impact of environment variables. We present alternating optimisation and quadrature (ALOQ), which uses Bayesian optimisation and Bayesian quadrature to address such settings. We also present transferable ALOQ (TALOQ), for settings where simulator inaccuracies lead to difficulty in transferring the learnt policy to the physical system. We show that our algorithms are robust to the presence of significant rare events, which may not be observable under random sampling but play a substantial role in determining the optimal policy. Experimental results across different domains show that our algorithms learn robust policies efficiently.
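The alternation the abstract describes — estimating a policy's expected return over the environment variables by quadrature, while searching over policy parameters — can be illustrated with a minimal toy sketch. This is not the paper's ALOQ algorithm (which fits a Gaussian process over policy and environment variables and uses Bayesian optimisation for the policy search); here the quadrature step is plain Gauss-Hermite quadrature for a Gaussian environment variable, and the optimisation step is a coarse grid search. All names (`simulated_return`, the return surface, the distribution of `nu`) are illustrative assumptions.

```python
import numpy as np

# Hypothetical toy setting (not from the paper): the policy is a single
# scalar parameter theta; the environment variable nu is unobservable at
# run time and distributed N(0, 1), but can be set directly in a simulator.
def simulated_return(theta, nu):
    # Toy return surface; the robust optimum is theta = 0.
    return -(theta - nu) ** 2

# Quadrature step (stand-in for Bayesian quadrature): Gauss-Hermite nodes
# and weights estimate E_nu[f(theta, nu)] for nu ~ N(0, 1). ALOQ instead
# integrates a Gaussian-process posterior over nu.
nodes, weights = np.polynomial.hermite.hermgauss(20)
nu_samples = np.sqrt(2.0) * nodes        # change of variables for N(0, 1)
quad_weights = weights / np.sqrt(np.pi)  # weights now sum to 1

def expected_return(theta):
    return float(np.sum(quad_weights * simulated_return(theta, nu_samples)))

# Optimisation step (stand-in for Bayesian optimisation): grid search over
# theta, maximising the quadrature estimate of the expected return.
candidates = np.linspace(-2.0, 2.0, 81)
best_theta = max(candidates, key=expected_return)
print(best_theta)  # close to 0, the robust optimum
```

Because the expected return here is -(theta^2 + 1), the robust policy is theta = 0; a policy tuned to any single draw of nu would instead track that draw, which is exactly the failure mode marginalising over environment variables avoids.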
Pages: 31