Model-Free Deep Inverse Reinforcement Learning by Logistic Regression

Cited by: 1
Authors
Eiji Uchibe
Affiliations
[1] ATR Computational Neuroscience Labs., Department of Brain Robot Interface
[2] Okinawa Institute of Science and Technology Graduate University, Neural Computation Unit
Source
Neural Processing Letters | 2018 / Vol. 47
Keywords
Inverse reinforcement learning; Deep learning; Density ratio estimation; Logistic regression;
DOI
Not available
Abstract
This paper proposes model-free deep inverse reinforcement learning to find nonlinear reward function structures. We formulate inverse reinforcement learning as a problem of density ratio estimation and show that, under the framework of linearly solvable Markov decision processes, the log of the ratio between an optimal state transition and a baseline one is given by a part of the reward and the difference of the value functions. The log density ratio is efficiently estimated by binomial logistic regression, in which the classifier is constructed from the reward and state value functions. The classifier discriminates between samples drawn from the optimal state transition probability and those drawn from the baseline one. The estimated state value function is then used to initialize part of the deep neural network for forward reinforcement learning. The proposed deep forward and inverse reinforcement learning is applied to two benchmark games, Atari 2600 and Reversi. Simulation results show that our method reaches the best performance substantially faster than the standard combination of forward and inverse reinforcement learning, as well as behavior cloning.
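The density-ratio idea in the abstract can be illustrated with a toy sketch: train a binary logistic classifier on samples from a stand-in "optimal" distribution (label 1) and a "baseline" distribution (label 0); with equal sample sizes, the classifier's logit approximates the log density ratio. The 1-D Gaussian data and the plain gradient-descent training loop below are illustrative assumptions, not the paper's actual transition distributions or network:

```python
import numpy as np

# Toy stand-ins for the two transition distributions (assumption: 1-D Gaussians)
rng = np.random.default_rng(0)
x_opt = rng.normal(1.0, 1.0, 2000)   # "optimal" samples, label 1
x_base = rng.normal(0.0, 1.0, 2000)  # "baseline" samples, label 0

x = np.concatenate([x_opt, x_base])
y = np.concatenate([np.ones(2000), np.zeros(2000)])

# Binomial logistic regression fit by gradient descent on the mean log-loss
w, b = 0.0, 0.0
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # sigmoid of the logit
    w -= 1.0 * np.mean((p - y) * x)
    b -= 1.0 * np.mean(p - y)

# With equal sample sizes, the logit w*x + b estimates log p_opt(x)/p_base(x).
# The analytic log ratio for N(1,1) vs N(0,1) is x - 0.5, so w ~ 1 and b ~ -0.5.
print(w, b)
```

In the paper's setting, the logit would be parameterized by the reward and state value function rather than a single linear weight, so fitting the classifier recovers those quantities jointly.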
Pages: 891 - 905
Page count: 14
Related Papers
50 items in total
  • [21] Recovering Robustness in Model-Free Reinforcement Learning
    Venkataraman, Harish K.
    Seiler, Peter J.
    [J]. 2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 4210 - 4216
  • [22] Learning explainable task-relevant state representation for model-free deep reinforcement learning
    Zhao, Tingting
    Li, Guixi
    Zhao, Tuo
    Chen, Yarui
    Xie, Ning
    Niu, Gang
    Sugiyama, Masashi
    [J]. NEURAL NETWORKS, 2024, 180
  • [23] Shrinkage inverse regression estimation for model-free variable selection
    Bondell, Howard D.
    Li, Lexin
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2009, 71 : 287 - 299
  • [24] Correction to: Model-free inverse reinforcement learning with multi-intention, unlabeled, and overlapping demonstrations
    Ariyan Bighashdel
    Pavol Jancura
    Gijs Dubbelman
    [J]. Machine Learning, 2023, 112 : 429 - 430
  • [25] Model-Free Control for Distributed Stream Data Processing using Deep Reinforcement Learning
    Li, Teng
    Xu, Zhiyuan
    Tang, Jian
    Wang, Yanzhi
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 11 (06): : 705 - 718
  • [26] Control of neural systems at multiple scales using model-free, deep reinforcement learning
    Mitchell, B. A.
    Petzold, L. R.
    [J]. SCIENTIFIC REPORTS, 2018, 8
  • [27] An Adaptive Model-Free Control Method for Metro Train Based on Deep Reinforcement Learning
    Lai, Wenzhu
    Chen, Dewang
    Huang, Yunhu
    Huang, Benzun
    [J]. ADVANCES IN NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, ICNC-FSKD 2022, 2023, 153 : 263 - 273
  • [28] Improve the Stability and Robustness of Power Management through Model-free Deep Reinforcement Learning
    Chen, Lin
    Li, Xiao
    Xu, Jiang
    [J]. PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 1371 - 1376
  • [29] Model-Free Optimal Vibration Control of a Nonlinear System Based on Deep Reinforcement Learning
    Jiang, Jiyuan
    Tang, Jie
    Zhao, Kun
    Li, Meng
    Li, Yinghui
    Cao, Dengqing
    [J]. INTERNATIONAL JOURNAL OF STRUCTURAL STABILITY AND DYNAMICS, 2024,