Towards Interpretable Reinforcement Learning with State Abstraction Driven by External Knowledge

Cited by: 4
Authors: Bougie, Nicolas [1,2]; Ichise, Ryutaro [1,3]
Affiliations:
[1] Sokendai, Tokyo 101-8430, Japan
[2] National Institute of Informatics, Tokyo 101-8430, Japan
[3] National Institute of Informatics, Information Research Division, Tokyo 101-8430, Japan
Keywords: reinforcement learning; symbolic reinforcement learning; reasoning about knowledge; interpretable reinforcement learning; classifier
DOI: 10.1587/transinf.2019EDP7170
CLC number: TP [Automation Technology, Computer Technology]
Subject classification: 0812
Abstract:
Advances in deep reinforcement learning have demonstrated its effectiveness in a wide variety of domains. Deep neural networks are capable of approximating value functions and policies in complex environments. However, deep neural networks suffer from a number of drawbacks. Their lack of interpretability limits their usability in many safety-critical real-world scenarios. Moreover, they rely on huge amounts of data to learn efficiently; this may be acceptable in simulated tasks, but it restricts their use in many real-world applications. Finally, their generalization capability is low, that is, their ability to recognize that a situation is similar to one encountered previously. We present a method that combines external knowledge and interpretable reinforcement learning. We derive a rule-based variant of the Sarsa(λ) algorithm, which we call Sarsa-rb(λ), that augments data with prior knowledge and exploits similarities among states. We demonstrate that our approach leverages small amounts of prior knowledge to significantly accelerate learning in multiple domains such as trading and visual navigation. The resulting agent provides substantial gains in training speed and performance over deep Q-learning (DQN) and deep deterministic policy gradient (DDPG), and improves stability over proximal policy optimization (PPO).
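As context for the Sarsa-rb(λ) idea, the sketch below shows standard tabular Sarsa(λ) with accumulating eligibility traces, the base algorithm the paper extends. The rule-based ingredient is only hinted at through an optional init_q table that could be seeded from prior knowledge; init_q, the env interface (reset/step), and all hyperparameter values here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def sarsa_lambda(env, n_states, n_actions, episodes=500,
                 alpha=0.1, gamma=0.99, lam=0.9, epsilon=0.1,
                 init_q=None):
    """Tabular Sarsa(lambda) with accumulating eligibility traces.

    init_q: optional (n_states, n_actions) array. Sarsa-rb(lambda) seeds
    value estimates from rule-based prior knowledge; this argument only
    approximates that idea (an assumption, not the paper's mechanism).
    """
    Q = np.zeros((n_states, n_actions)) if init_q is None else init_q.copy()

    def epsilon_greedy(s):
        # Explore uniformly with probability epsilon, otherwise act greedily.
        if np.random.rand() < epsilon:
            return np.random.randint(n_actions)
        return int(np.argmax(Q[s]))

    for _ in range(episodes):
        E = np.zeros_like(Q)           # eligibility traces, reset per episode
        s = env.reset()                # assumed: returns an integer state id
        a = epsilon_greedy(s)
        done = False
        while not done:
            s2, r, done = env.step(a)  # assumed: (next_state, reward, done)
            a2 = epsilon_greedy(s2)
            # On-policy TD error; the bootstrap term vanishes at terminal states.
            delta = r + gamma * Q[s2, a2] * (not done) - Q[s, a]
            E[s, a] += 1.0             # accumulate trace for the visited pair
            Q += alpha * delta * E     # update every recently visited pair
            E *= gamma * lam           # decay all traces
            s, a = s2, a2
    return Q
```

The eligibility traces are what distinguish Sarsa(λ) from one-step Sarsa: each TD error is propagated to all state-action pairs visited recently, weighted by γλ per step, which is also where rule-derived similarities among states could plausibly be used to share credit across related states.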
Pages: 2143-2153 (11 pages)