A posteriori control densities: Imitation learning from partial observations

Cited by: 0
Authors
Lefebvre, Tom [1]
Crevecoeur, Guillaume [1]
Affiliations
[1] Univ Ghent, Dept Electromech Syst & Met Engn, Fac Engn & Architecture, Dynam Design Lab D2LAB, Technol Pk 131, B-9052 Zwijnaarde, Belgium
Keywords
Information theory; Hidden Markov models; Bayesian methods; Imitation learning; Markov decision processes
DOI
10.1016/j.patrec.2023.04.001
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
This paper treats a special case of the Imitation from Observations (IfO) problem. IfO is a generalisation of Imitation Learning from state-only demonstrations. Our treatment of IfO considers the case of feature-only demonstrations. This means that the full state is inaccessible for inference, and imitation must occur on the basis of a limited set of features. We refer to this setting as Imitation from Partial Observations (IfPO). This setting accommodates a wider variety of demonstrations and addresses the problem of a heteromorphic student and teacher. We seek policy learning methods that extract an executable state-feedback policy directly from those features, an approach known in the literature as Behavioural Cloning. In this theoretical work, we formalise the rational inference model of the student decision maker, devoted to imitation, as a controlled Hidden Markov Model. The IfPO problem is then reformulated as a Maximum Likelihood Estimation problem and treated using Expectation-Maximization. We name the resulting fixed-point iterations A Posteriori Control Densities. We compare the presented approach to existing methods in the field and identify potential directions for further development, such as an extension to unknown transition and emission models. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
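The abstract's pipeline (controlled Hidden Markov Model, Maximum Likelihood Estimation, Expectation-Maximization) can be illustrated with a small tabular sketch. This is not the paper's derivation: it is a generic EM iteration under assumptions made here for illustration, namely finite state, action, and feature sets, known transition and emission models, and a single feature sequence. The names `em_policy_step`, `T`, `E`, `p0`, and `ys` are hypothetical.

```python
import numpy as np

def em_policy_step(pi, T, E, p0, ys):
    """One EM update of a tabular policy pi[x, u] from a feature sequence ys.

    Assumed (hypothetical) conventions:
    T[x, u, z] = p(z | x, u), E[x, y] = p(y | x), p0[x] = p(x_0).
    Returns the updated policy and the log-likelihood of ys under the input pi.
    """
    n = len(ys)
    # Closed-loop transition matrix under the current policy: P[x, z].
    P = np.einsum("xu,xuz->xz", pi, T)

    # E-step, forward pass (scaled so each alpha[t] sums to one).
    alpha = np.zeros((n, len(p0)))
    scale = np.zeros(n)
    a = p0 * E[:, ys[0]]
    scale[0] = a.sum()
    alpha[0] = a / scale[0]
    for t in range(1, n):
        a = (alpha[t - 1] @ P) * E[:, ys[t]]
        scale[t] = a.sum()
        alpha[t] = a / scale[t]

    # E-step, backward pass with the same scaling.
    beta = np.ones((n, len(p0)))
    for t in range(n - 2, -1, -1):
        beta[t] = P @ (E[:, ys[t + 1]] * beta[t + 1]) / scale[t + 1]

    # Expected state-action counts: posterior p(x_t, u_t | ys), summed over t.
    counts = np.zeros_like(pi)
    for t in range(n - 1):
        msg = T @ (E[:, ys[t + 1]] * beta[t + 1])  # shape (states, actions)
        counts += alpha[t][:, None] * pi * msg / scale[t + 1]

    # M-step: normalise counts per state; keep the old row if a state has no mass.
    row = counts.sum(axis=1, keepdims=True)
    new_pi = np.where(row > 0, counts / np.maximum(row, 1e-300), pi)
    return new_pi, float(np.log(scale).sum())

# Toy problem (illustrative only): action 0 tends to stay, action 1 tends to switch.
T = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.1, 0.9], [0.9, 0.1]]])
E = np.array([[0.9, 0.1], [0.1, 0.9]])   # noisy feature of the hidden state
p0 = np.array([0.5, 0.5])
ys = [0, 1, 0, 1, 0, 1]                  # alternating features from the teacher

pi = np.full((2, 2), 0.5)                # start from a uniform policy
logliks = []
for _ in range(20):
    pi, ll = em_policy_step(pi, T, E, p0, ys)
    logliks.append(ll)
```

In this toy setting the student never sees states or actions, only the feature sequence `ys`, yet the iteration shifts probability mass towards the action that best explains it. The log-likelihood recorded in `logliks` should not decrease between iterations, which is the usual EM guarantee.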
Pages: 87-94
Number of pages: 8
Related papers
50 in total
  • [1] Sequential robot imitation learning from observations
    Tanwani, Ajay Kumar
    Yan, Andy
    Lee, Jonathan
    Calinon, Sylvain
    Goldberg, Ken
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2021, 40 (10-11): 1306-1325
  • [2] Learning from Partial Observations
    Michael, Loizos
    [J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007: 968-974
  • [3] Off-Policy Imitation Learning from Observations
    Zhu, Zhuangdi
    Lin, Kaixiang
    Dai, Bo
    Zhou, Jiayu
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [4] To Follow or not to Follow: Selective Imitation Learning from Observations
    Lee, Youngwoon
    Hu, Edward S.
    Yang, Zhengyu
    Lim, Joseph J.
    [J]. CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
  • [5] Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement
    Yang, Chao
    Ma, Xiaojian
    Huang, Wenbing
    Sun, Fuchun
    Liu, Huaping
    Huang, Junzhou
    Gan, Chuang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [6] Probabilistic model of whole-body motion imitation from partial observations
    Lee, DH
    Nakamura, Y
    [J]. 2005 12th International Conference on Advanced Robotics, 2005: 337-343
  • [7] Sensing Jamming Strategy From Limited Observations: An Imitation Learning Perspective
    Fan, Youlin
    Jiu, Bo
    Pu, Wenqiang
    Li, Ziniu
    Li, Kang
    Liu, Hongwei
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72: 4098-4114
  • [8] The Logic of AGM Learning from Partial Observations
    Baltag, Alexandru
    Ozgun, Aybuke
    Vargas-Sandoval, Ana Lucia
    [J]. DYNAMIC LOGIC: NEW TRENDS AND APPLICATIONS, DALI 2019, 2020, 12005: 35-52
  • [9] Learning Valuation Distributions from Partial Observations
    Blum, Avrim
    Mansour, Yishay
    Morgenstern, Jamie
    [J]. PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015: 798-804
  • [10] Learning to Play Robot Soccer from Partial Observations
    Szemenyei, Marton
    Reizinger, Patrik
    [J]. 2020 23RD IEEE INTERNATIONAL SYMPOSIUM ON MEASUREMENT AND CONTROL IN ROBOTICS (ISMCR), 2020