Model-Free Deep Inverse Reinforcement Learning by Logistic Regression

Cited by: 1
Authors
Eiji Uchibe
Affiliations
[1] ATR Computational Neuroscience Labs., Department of Brain Robot Interface
[2] Okinawa Institute of Science and Technology Graduate University, Neural Computation Unit
Source
Neural Processing Letters | 2018 / Vol. 47
Keywords
Inverse reinforcement learning; Deep learning; Density ratio estimation; Logistic regression;
DOI
Not available
Abstract
This paper proposes model-free deep inverse reinforcement learning to find nonlinear reward function structures. We formulate inverse reinforcement learning as a problem of density ratio estimation and show that, under the framework of linearly solvable Markov decision processes, the log of the ratio between an optimal state transition and a baseline one is given by a part of the reward and the difference of the value functions. The log density ratio is efficiently estimated by binomial logistic regression, in which the classifier is constructed from the reward and state value functions. The classifier discriminates between samples drawn from the optimal state transition probability and those drawn from the baseline one. The estimated state value function is then used to initialize part of the deep neural network for forward reinforcement learning. The proposed deep forward and inverse reinforcement learning is applied to two benchmark games, Atari 2600 and Reversi. Simulation results show that our method reaches the best performance substantially faster than the standard combination of forward and inverse reinforcement learning, as well as behavior cloning.
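The density-ratio idea in the abstract can be illustrated with a toy sketch: train a binary logistic classifier on samples from a stand-in "optimal" distribution (label 1) and a "baseline" distribution (label 0); with equal sample sizes, the classifier's logit approximates the log density ratio. The 1-D Gaussian data and the plain gradient-descent training loop below are illustrative assumptions, not the paper's actual transition distributions or network:

```python
import numpy as np

# Toy stand-ins for the two transition distributions (assumption: 1-D Gaussians)
rng = np.random.default_rng(0)
x_opt = rng.normal(1.0, 1.0, 2000)   # "optimal" samples, label 1
x_base = rng.normal(0.0, 1.0, 2000)  # "baseline" samples, label 0

x = np.concatenate([x_opt, x_base])
y = np.concatenate([np.ones(2000), np.zeros(2000)])

# Binomial logistic regression fit by gradient descent on the mean log-loss
w, b = 0.0, 0.0
for _ in range(3000):
    p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # sigmoid of the logit
    w -= 1.0 * np.mean((p - y) * x)
    b -= 1.0 * np.mean(p - y)

# With equal sample sizes, the logit w*x + b estimates log p_opt(x)/p_base(x).
# The analytic log ratio for N(1,1) vs N(0,1) is x - 0.5, so w ~ 1 and b ~ -0.5.
print(w, b)
```

In the paper's setting, the logit would be parameterized by the reward and state value function rather than a single linear weight, so fitting the classifier recovers those quantities jointly.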
Pages: 891 - 905
Page count: 14
Related Papers
50 items in total
  • [21] Recovering Robustness in Model-Free Reinforcement Learning
    Venkataraman, Harish K.
    Seiler, Peter J.
    [J]. 2019 AMERICAN CONTROL CONFERENCE (ACC), 2019, : 4210 - 4216
  • [22] Learning explainable task-relevant state representation for model-free deep reinforcement learning
    Zhao, Tingting
    Li, Guixi
    Zhao, Tuo
    Chen, Yarui
    Xie, Ning
    Niu, Gang
    Sugiyama, Masashi
    [J]. NEURAL NETWORKS, 2024, 180
  • [23] Shrinkage inverse regression estimation for model-free variable selection
    Bondell, Howard D.
    Li, Lexin
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2009, 71 : 287 - 299
  • [24] Correction to: Model-free inverse reinforcement learning with multi-intention, unlabeled, and overlapping demonstrations
    Ariyan Bighashdel
    Pavol Jancura
    Gijs Dubbelman
    [J]. Machine Learning, 2023, 112 : 429 - 430
  • [25] Model-Free Control for Distributed Stream Data Processing using Deep Reinforcement Learning
    Li, Teng
    Xu, Zhiyuan
    Tang, Jian
    Wang, Yanzhi
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2018, 11 (06): : 705 - 718
  • [26] Control of neural systems at multiple scales using model-free, deep reinforcement learning
    Mitchell, B. A.
    Petzold, L. R.
    [J]. SCIENTIFIC REPORTS, 2018, 8
  • [27] An Adaptive Model-Free Control Method for Metro Train Based on Deep Reinforcement Learning
    Lai, Wenzhu
    Chen, Dewang
    Huang, Yunhu
    Huang, Benzun
    [J]. ADVANCES IN NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, ICNC-FSKD 2022, 2023, 153 : 263 - 273
  • [28] Improve the Stability and Robustness of Power Management through Model-free Deep Reinforcement Learning
    Chen, Lin
    Li, Xiao
    Xu, Jiang
    [J]. PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 1371 - 1376
  • [29] Model-Free Optimal Vibration Control of a Nonlinear System Based on Deep Reinforcement Learning
    Jiang, Jiyuan
    Tang, Jie
    Zhao, Kun
    Li, Meng
    Li, Yinghui
    Cao, Dengqing
    [J]. INTERNATIONAL JOURNAL OF STRUCTURAL STABILITY AND DYNAMICS, 2024,