Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

Cited by: 0
Authors
Kulkarni, Tejas D. [1 ,4 ]
Narasimhan, Karthik R. [2 ]
Saeedi, Ardavan [2 ]
Tenenbaum, Joshua B. [3 ]
Affiliations
[1] DeepMind, London, England
[2] MIT, CSAIL, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[3] MIT, BCS, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[4] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms. One of the key difficulties is insufficient exploration, resulting in an agent being unable to learn robust policies. Intrinsically motivated agents can explore new behavior for its own sake rather than to directly solve external goals. Such intrinsic behaviors could eventually help the agent solve tasks posed by the environment. We present hierarchical-DQN (h-DQN), a framework to integrate hierarchical action-value functions, operating at different temporal scales, with goal-driven intrinsically motivated deep reinforcement learning. A top-level Q-value function learns a policy over intrinsic goals, while a lower-level function learns a policy over atomic actions to satisfy the given goals. h-DQN allows for flexible goal specifications, such as functions over entities and relations. This provides an efficient space for exploration in complicated environments. We demonstrate the strength of our approach on two problems with very sparse and delayed feedback: (1) a complex discrete stochastic decision process, and (2) the classic ATARI game 'Montezuma's Revenge'.
Pages: 9
Related Papers (50 total)
  • [41] Intrinsic Motivation in Model-Based Reinforcement Learning: A Brief Review
    A. K. Latyshev
    A. I. Panov
    Scientific and Technical Information Processing, 2024, 51 (5) : 460 - 470
  • [42] Hierarchical Deep Reinforcement Learning for VWAP Strategy Optimization
    Li, Xiaodong
    Wu, Pangjing
    Zou, Chenxin
    Li, Qing
    IEEE TRANSACTIONS ON BIG DATA, 2024, 10 (03) : 288 - 300
  • [43] Visual Tracking via Hierarchical Deep Reinforcement Learning
    Zhang, Dawei
    Zheng, Zhonglong
    Jia, Riheng
    Li, Minglu
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 3315 - 3323
  • [44] Hierarchical Deep Reinforcement Learning for Continuous Action Control
    Yang, Zhaoyang
    Merrick, Kathryn
    Jin, Lianwen
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (11) : 5174 - 5184
  • [45] Generalizing Reinforcement Learning through Fusing Self-Supervised Learning into Intrinsic Motivation
    Wu, Keyu
    Wu, Min
    Chen, Zhenghua
    Xu, Yuecong
    Li, Xiaoli
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8683 - 8690
  • [46] Modular neural networks for reinforcement learning with temporal intrinsic rewards
    Takeuchi, Johane
    Shouno, Osamu
    Tsujino, Hiroshi
    2007 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-6, 2007, : 1151 - 1156
  • [47] THE EFFECTS OF EXTRINSIC REINFORCEMENT ON INTRINSIC MOTIVATION
    BLOCKER, RA
    EDWARDS, RP
    PSYCHOLOGY IN THE SCHOOLS, 1982, 19 (02) : 260 - 268
  • [49] REINFORCEMENT, REWARD, AND INTRINSIC MOTIVATION - A METAANALYSIS
    CAMERON, J
    PIERCE, WD
    REVIEW OF EDUCATIONAL RESEARCH, 1994, 64 (03) : 363 - 423
  • [50] Deep Learning and Hierarchical Reinforcement Learning for modeling a Conversational Recommender System
    Basile, Pierpaolo
    Greco, Claudio
    Suglia, Alessandro
    Semeraro, Giovanni
    INTELLIGENZA ARTIFICIALE, 2018, 12 (02) : 125 - 141