Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Cited by: 0
Authors
Kulkarni, Tejas D. [1 ,4 ]
Narasimhan, Karthik R. [2 ]
Saeedi, Ardavan [2 ]
Tenenbaum, Joshua B. [3 ]
Affiliations
[1] DeepMind, London, England
[2] MIT, CSAIL, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] MIT, BCS, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. One of the key difficulties is insufficient exploration, resulting in an agent being unable to learn robust policies. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve external goals. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical action-value functions, operating at different temporal scales, with goal-driven intrinsically motivated deep reinforcement learning. A top-level Q-value function learns a policy over intrinsic goals, while a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse and delayed feedback: (1) a complex discrete stochastic decision process, and (2) the classic ATARI game 'Montezuma's Revenge'.
Pages: 9
Related Papers (50 total)
  • [41] Intrinsic Motivation in Model-Based Reinforcement Learning: A Brief Review
    A. K. Latyshev
    A. I. Panov
    Scientific and Technical Information Processing, 2024, 51 (5) : 460 - 470
  • [42] Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization
    Li, Xiaodong
    Wu, Pangjing
    Zou, Chenxin
    Li, Qing
    IEEE TRANSACTIONS ON BIG DATA, 2024, 10 (03) : 288 - 300
  • [43] Visual Tracking via Hierarchical Deep Reinforcement Learning
    Zhang, Dawei
    Zheng, Zhonglong
    Jia, Riheng
    Li, Minglu
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 3315 - 3323
  • [44] Hierarchical Deep Reinforcement Learning for Continuous Action Control
    Yang, Zhaoyang
    Merrick, Kathryn
    Jin, Lianwen
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (11) : 5174 - 5184
  • [45] Generalizing Reinforcement Learning through Fusing Self-Supervised Learning into Intrinsic Motivation
    Wu, Keyu
    Wu, Min
    Chen, Zhenghua
    Xu, Yuecong
    Li, Xiaoli
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8683 - 8690
  • [46] Modular neural networks for reinforcement learning with temporal intrinsic rewards
    Takeuchi, Johane
    Shouno, Osamu
    Tsujino, Hiroshi
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 1151 - 1156
  • [47] THE EFFECTS OF EXTRINSIC REINFORCEMENT ON INTRINSIC MOTIVATION
    BLOCKER, RA
    EDWARDS, RP
    PSYCHOLOGY IN THE SCHOOLS, 1982, 19 (02) : 260 - 268
  • [49] REINFORCEMENT, REWARD, AND INTRINSIC MOTIVATION - A METAANALYSIS
    CAMERON, J
    PIERCE, WD
    REVIEW OF EDUCATIONAL RESEARCH, 1994, 64 (03) : 363 - 423
  • [50] Deep Learning and Hierarchical Reinforcement Learning for modeling a Conversational Recommender System
    Basile, Pierpaolo
    Greco, Claudio
    Suglia, Alessandro
    Semeraro, Giovanni
    INTELLIGENZA ARTIFICIALE, 2018, 12 (02) : 125 - 141