Expert Initialized Hybrid Model-Based and Model-Free Reinforcement Learning

Cited by: 0
Authors
Langaa, Jeppe [1 ]
Sloth, Christoffer [1 ]
Affiliations
[1] Univ Southern Denmark, Maersk McKinney Moller Inst, Odense, Denmark
DOI
10.23919/ECC57647.2023.10178306
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Subject Classification Code
0812
Abstract
This paper presents a reinforcement learning algorithm that enables fast learning of control policies from a limited amount of training data by leveraging the strengths of both model-based and model-free algorithms. This is accomplished by initializing the reinforcement learning algorithm from expert demonstrations, which are used to learn a Gaussian process model and a policy that imitates the expert. The policy is subsequently improved by the Bi-population Covariance Matrix Adaptation Evolution Strategy (BIPOP-CMA-ES), a black-box optimizer that exploits the learned model. Finally, the policy parameters obtained from BIPOP-CMA-ES are refined by a model-free reinforcement learning algorithm. Scalable Variational Gaussian Processes are used in the model to accommodate high-dimensional state spaces and larger amounts of data; in addition, autoencoders reduce the dimensionality of the parameter space searched by BIPOP-CMA-ES. The algorithm is tested on a cart-pole system as well as on a higher-dimensional industrial peg-in-hole task, and is compared to state-of-the-art model-free and model-based algorithms. The proposed algorithm solves the peg-in-hole task faster than previous algorithms.
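To make the three-stage structure of the abstract concrete, the following is a minimal Python sketch: a policy is first fitted to synthetic expert demonstrations by least squares (standing in for the paper's Gaussian-process-based initialization), then refined against a learned forward model with BIPOP-CMA-ES via the `cma` package. The toy scalar dynamics, the linear policy, and all function names are illustrative assumptions, not the authors' implementation; the final model-free refinement stage is only indicated in a comment.

```python
# Minimal sketch of the pipeline outlined in the abstract, on a toy scalar
# system. Everything here is an illustrative assumption: the paper uses a
# Gaussian process model, an autoencoder-compressed parameter space, and a
# real peg-in-hole task, none of which are reproduced. Requires `pip install cma`.
import numpy as np
import cma

rng = np.random.default_rng(0)

def model(x, u):
    # Stand-in for the learned (Gaussian process) forward model.
    return 0.9 * x + 0.5 * u

def expert(x):
    # Hypothetical expert controller, used only to generate demonstrations.
    return -1.2 * x

# Stage 1: collect expert demonstrations and fit an imitating policy.
xs, us, x = [], [], 1.0
for _ in range(200):
    u = expert(x) + 0.05 * rng.normal()
    xs.append(x)
    us.append(u)
    x = model(x, u) + 0.01 * rng.normal()
X, U = np.array(xs), np.array(us)
gain0 = float(np.sum(X * U) / np.sum(X * X))   # least-squares policy gain

# Stage 2: model-based refinement with BIPOP-CMA-ES (pycma's `bipop` flag).
def rollout_cost(theta):
    # Black-box objective: quadratic cost of a 50-step rollout through the
    # learned model under the linear policy u = theta[0] * x + theta[1].
    cost, x = 0.0, 1.0
    for _ in range(50):
        u = theta[0] * x + theta[1]
        cost += x ** 2 + 0.1 * u ** 2
        x = model(x, u)
    return cost

result = cma.fmin(rollout_cost, [gain0, 0.0], 0.3,
                  options={'maxfevals': 3000, 'verbose': -9},
                  bipop=True, restarts=3)
theta_star = result[0]
print('initial gain:', gain0, '-> refined parameters:', theta_star)

# Stage 3 (not shown): theta_star would seed a model-free reinforcement
# learning algorithm that refines the policy on the real system.
```

The sketch mirrors only the control flow described in the abstract: imitation for initialization, black-box model-based policy search, then a hand-off to model-free learning.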
Pages: 6