Model-Based Imitation Learning Using Entropy Regularization of Model and Policy

Cited by: 1
Authors
Uchibe, Eiji [1 ]
Affiliations
[1] ATR Computational Neuroscience Laboratories, Department of Brain Robot Interface, Kyoto 619-0288, Japan
Keywords
Imitation learning; machine learning for robot control; reinforcement learning
DOI
10.1109/LRA.2022.3196139
CLC Classification
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
Imitation-learning approaches based on generative adversarial networks are promising because they are sample efficient with respect to expert demonstrations. However, training the generator still requires many interactions with the actual environment, because model-free reinforcement learning is used to update the policy. To improve sample efficiency, we propose Model-Based Entropy-Regularized Imitation Learning (MB-ERIL), formulated under the entropy-regularized Markov decision process, which reduces the number of interactions with the actual environment through model-based reinforcement learning. MB-ERIL uses two discriminators: a policy discriminator distinguishes actions generated by the robot from expert actions, and a model discriminator distinguishes counterfactual state transitions generated by the model from actual ones. We derive structured discriminators so that both the policy and the model are learned efficiently. Computer simulations and real-robot experiments show that MB-ERIL achieves competitive performance and significantly improves sample efficiency over baseline methods.
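To make the two-discriminator idea concrete, here is a minimal sketch (in PyTorch; not the authors' implementation) of how a policy discriminator over (state, action) pairs and a model discriminator over (state, action, next state) transitions could be wired up. The network sizes, the `Discriminators` class, and the plain GAN binary cross-entropy loss are illustrative assumptions; the paper's structured discriminators incorporate additional policy and model terms not shown here.

```python
# Minimal sketch of MB-ERIL's two discriminators (assumed architecture,
# flat state/action vectors, plain GAN-style losses).
import torch
import torch.nn as nn

def mlp(in_dim: int, hidden: int = 64) -> nn.Sequential:
    # Two-hidden-layer scalar scorer; the architecture is an assumption.
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, 1),
    )

class Discriminators(nn.Module):
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        # Policy discriminator: expert (s, a) pairs vs. pairs from the
        # learner's policy.
        self.d_policy = mlp(state_dim + action_dim)
        # Model discriminator: actual (s, a, s') transitions vs.
        # counterfactual ones whose s' is drawn from the learned model.
        self.d_model = mlp(2 * state_dim + action_dim)

    def policy_logit(self, s, a):
        return self.d_policy(torch.cat([s, a], dim=-1))

    def model_logit(self, s, a, s_next):
        return self.d_model(torch.cat([s, a, s_next], dim=-1))

def discriminator_loss(logit_real, logit_fake):
    # Standard GAN objective: expert/actual data labeled 1, generated data 0.
    bce = nn.BCEWithLogitsLoss()
    return (bce(logit_real, torch.ones_like(logit_real))
            + bce(logit_fake, torch.zeros_like(logit_fake)))

# Example: score a batch of transitions in a 4-D state, 2-D action task,
# using random tensors as stand-ins for real and model-generated data.
if __name__ == "__main__":
    d = Discriminators(state_dim=4, action_dim=2)
    s, a, s_next = torch.randn(32, 4), torch.randn(32, 2), torch.randn(32, 4)
    s_next_model = torch.randn(32, 4)  # would come from the learned model
    loss = discriminator_loss(d.model_logit(s, a, s_next),
                              d.model_logit(s, a, s_next_model))
    loss.backward()
```

In the full method, the learned dynamics model and the policy act as the two generators, each trained against its own discriminator, so that imagined rollouts can substitute for many real-environment interactions.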
Pages: 10922-10929
Page count: 8