Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Cited by: 0
Authors
Kulkarni, Tejas D. [1 ,4 ]
Narasimhan, Karthik R. [2 ]
Saeedi, Ardavan [2 ]
Tenenbaum, Joshua B. [3 ]
Affiliations
[1] DeepMind, London, England
[2] MIT, CSAIL, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] MIT, BCS, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. One of the key difficulties is insufficient exploration, resulting in an agent being unable to learn robust policies. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve external goals. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework that integrates hierarchical action-value functions, operating at different temporal scales, with goal-driven intrinsically motivated deep reinforcement learning. A top-level Q-value function learns a policy over intrinsic goals, while a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations, which provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse and delayed feedback: (1) a complex discrete decision process with stochastic transitions, and (2) the classic ATARI game 'Montezuma's Revenge'.
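The two-level decomposition described in the abstract can be sketched with plain tabular Q-learning on a toy chain environment. This is an illustrative stand-in, not the authors' deep-network implementation: the environment, reward values, step caps, and hyperparameters below are all assumptions chosen to keep the example small. The essential structure matches the abstract: a meta-level value function selects an intrinsic goal, and a lower-level controller is rewarded intrinsically for reaching that goal, while only the meta level learns from the (sparse) extrinsic reward.

```python
import random

random.seed(0)

# Toy 6-state chain (an illustrative stand-in for the paper's stochastic
# decision process): action 0 moves left deterministically; action 1 moves
# right with probability 0.5, otherwise the agent slips left.
N, ACTIONS = 6, 2

def step(s, a):
    if a == 1 and random.random() < 0.5:
        return min(s + 1, N - 1)
    return max(s - 1, 0)

# h-DQN's two temporal levels, here as plain Q-tables (goals = states):
#   meta_q[s][g]    -- top level: value of committing to goal g from state s
#   ctrl_q[g][s][a] -- lower level: value of action a while pursuing goal g
meta_q = [[0.0] * N for _ in range(N)]
ctrl_q = [[[0.0, 0.0] for _ in range(N)] for _ in range(N)]
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1

def pick(values, eps):
    """Epsilon-greedy selection with random tie-breaking."""
    if random.random() < eps:
        return random.randrange(len(values))
    best = max(values)
    return random.choice([i for i, v in enumerate(values) if v == best])

def run_episode(start=1, max_meta=10, max_steps=30):
    s = start
    for _ in range(max_meta):
        g = pick(meta_q[s], EPS)          # meta-controller chooses a goal
        s0, ext = s, 0.0
        for _ in range(max_steps):
            a = pick(ctrl_q[g][s], EPS)   # controller chooses an action
            s2 = step(s, a)
            # Intrinsic reward: 1 when the chosen goal state is reached.
            r_int = 1.0 if s2 == g else 0.0
            ctrl_q[g][s][a] += ALPHA * (
                r_int + GAMMA * max(ctrl_q[g][s2]) - ctrl_q[g][s][a])
            # Extrinsic reward (illustrative): 1 for reaching the far end.
            ext += 1.0 if s2 == N - 1 else 0.0
            s = s2
            if r_int == 1.0:
                break
        # Only the meta-controller learns from extrinsic reward.
        meta_q[s0][g] += ALPHA * (
            ext + GAMMA * max(meta_q[s]) - meta_q[s0][g])

for _ in range(2000):
    run_episode()
```

After training, the controller table for the far-end goal prefers moving right, even though the extrinsic reward alone is too sparse for flat epsilon-greedy Q-learning to find it reliably; the meta-controller discovers that committing to that goal yields extrinsic reward. Deep function approximators replace these tables in the actual h-DQN architecture.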
Pages: 9