Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions

Cited by: 0
Authors
Ramponi, Giorgia [1]
Likmeta, Amarildo [1,2]
Metelli, Alberto Maria [1]
Tirinzoni, Andrea [1]
Restelli, Marcello [1]
Affiliations
[1] Politecnico di Milano, Milan, Italy
[2] University of Bologna, Bologna, Italy
Keywords: (none listed)
DOI: not available
CLC number: TP18 [Theory of Artificial Intelligence]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
We consider Inverse Reinforcement Learning (IRL) about multiple intentions, i.e., the problem of estimating the unknown reward functions optimized by a group of experts who demonstrate optimal behaviors. Most existing algorithms either require access to a model of the environment or need to repeatedly compute the optimal policies for the hypothesized rewards. However, these requirements are rarely met in real-world applications, in which interacting with the environment can be expensive or even dangerous. In this paper, we address IRL about multiple intentions in a fully model-free and batch setting. We first cast the single-intention IRL problem as a constrained likelihood maximization, and then use this formulation to cluster agents based on the likelihood of each assignment. In this way, we can efficiently solve, without interacting with the environment, both the IRL and the clustering problems. Finally, we evaluate the proposed methodology on simulated domains and on a real-world social-network application.
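As a rough illustration of the recipe the abstract describes, the sketch below alternates an E-step that assigns each expert to an intention in proportion to the likelihood of its demonstrations under that intention's reward, and an M-step that re-fits each reward by weighted likelihood maximization. This is not the authors' algorithm: it assumes a linear reward w·φ(s, a) and a Boltzmann demonstration model, and every name here (log_lik, em_irl_clustering, the toy bandit data) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def log_lik(w, feats, acts):
    """Boltzmann-policy log-likelihood of demonstrated actions.
    feats: (T, A, d) reward features per candidate action; acts: (T,) indices."""
    logp = np.log(softmax(feats @ w, axis=1) + 1e-12)
    return logp[np.arange(len(acts)), acts].sum()

def grad_log_lik(w, feats, acts):
    """Gradient of log_lik w.r.t. w: sum_t (phi(s_t, a_t) - E_{a~pi_w}[phi(s_t, a)])."""
    p = softmax(feats @ w, axis=1)                  # (T, A) action probabilities
    taken = feats[np.arange(len(acts)), acts]       # (T, d) features of chosen actions
    expected = (p[..., None] * feats).sum(axis=1)   # (T, d) expected features
    return (taken - expected).sum(axis=0)

def em_irl_clustering(experts, K, d, iters=50, lr=0.05, inner=20):
    """EM over expert-to-intention assignments; works on fixed demonstration
    batches only, with no interaction with the environment.
    experts: list of (feats, acts) demonstration batches, one per expert."""
    N = len(experts)
    W = rng.normal(scale=0.1, size=(K, d))          # one reward vector per intention
    mix = np.full(K, 1.0 / K)                       # mixing proportions
    for _ in range(iters):
        # E-step: responsibility of intention k for expert i is proportional
        # to mix[k] * exp(log-likelihood of expert i's batch under W[k]).
        ll = np.array([[log_lik(W[k], f, a) for k in range(K)]
                       for f, a in experts])        # (N, K)
        resp = softmax(ll + np.log(mix), axis=1)
        # M-step: weighted likelihood maximization for each intention's reward,
        # by plain gradient ascent on the responsibility-weighted log-likelihood.
        for k in range(K):
            for _ in range(inner):
                g = sum(resp[i, k] * grad_log_lik(W[k], f, a)
                        for i, (f, a) in enumerate(experts))
                total = sum(resp[i, k] * len(a)
                            for i, (f, a) in enumerate(experts)) + 1e-12
                W[k] += lr * g / total
        mix = resp.mean(axis=0)
    return W, resp

# Toy usage: 10 experts drawn from 2 intentions in a 5-action contextual bandit.
true_W = rng.normal(size=(2, 3))
experts = []
for i in range(10):
    feats = rng.normal(size=(200, 5, 3))
    probs = softmax(feats @ true_W[i % 2], axis=1)
    acts = np.array([rng.choice(5, p=p) for p in probs])
    experts.append((feats, acts))
W_hat, resp = em_irl_clustering(experts, K=2, d=3)
print(resp.round(2))  # rows should roughly concentrate on one of the two clusters
```

The batch flavor of the paper shows up in the M-step: the reward parameters are updated from the stored demonstrations alone, so no hypothesized reward ever has to be re-planned or re-simulated in the environment.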
Pages: 2359–2368
Page count: 10