Temporal Abstraction in Reinforcement Learning with the Successor Representation

Cited by: 0
Authors
Machado, Marlos C. [1 ]
Barreto, Andre [2 ]
Precup, Doina [3 ]
Bowling, Michael [1 ]
Affiliations
[1] Univ Alberta, Alberta Machine Intelligence Inst Amii, Dept Comp Sci, DeepMind, Edmonton, AB, Canada
[2] DeepMind, London, England
[3] McGill Univ, Quebec AI Inst Mila, Sch Comp Sci, DeepMind, Montreal, PQ, Canada
Keywords
Reinforcement learning; Options; Successor representation; Eigenoptions; Covering options; Option keyboard; Temporally-extended exploration; Slow feature analysis; Exploration; Framework; Level; MDPs
DOI
Not available
CLC number
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
Reasoning at multiple levels of temporal abstraction is one of the key attributes of intelligence. In reinforcement learning, this is often modeled through temporally extended courses of action called options. Options allow agents to make predictions and to operate at different levels of abstraction within an environment. Nevertheless, approaches based on the options framework often start with the assumption that a reasonable set of options is known beforehand. When this is not the case, there are no definitive answers for which options one should consider. In this paper, we argue that the successor representation, which encodes states based on the pattern of state visitation that follows them, can be seen as a natural substrate for the discovery and use of temporal abstractions. To support our claim, we take a big-picture view of recent results, showing how the successor representation can be used to discover options that facilitate either temporally-extended exploration or planning. We cast these results as instantiations of a general framework for option discovery in which the agent's representation is used to identify useful options, which are then used to further improve its representation. This results in a virtuous, never-ending cycle in which both the representation and the options are constantly refined based on each other. Beyond option discovery itself, we also discuss how the successor representation allows us to augment a set of options into a combinatorially large counterpart without additional learning. This is achieved through the combination of previously learned options. Our empirical evaluation focuses on options discovered for temporally-extended exploration and on the use of the successor representation to combine them. Our results shed light on important design decisions involved in the definition of options and demonstrate the synergy of different methods based on the successor representation, such as eigenoptions and the option keyboard.
Pages: 69
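
The abstract's central object, the successor representation (SR), has a simple closed form in the tabular, fixed-policy case, and the eigenoptions it mentions are derived from the SR's eigenvectors. The sketch below is not from the paper; it is a minimal NumPy illustration of both steps. The function names, the symmetrization of the SR before the eigendecomposition, the choice of k, and the example MDP are all assumptions made for this sketch.

import numpy as np

def successor_representation(P, gamma=0.95):
    """Closed-form successor representation (SR) of a fixed policy.

    P     : (n, n) state-to-state transition matrix induced by the policy.
    gamma : discount factor in [0, 1).

    Returns Psi with Psi[s, s2] = E[ sum_t gamma^t * 1{S_t = s2} | S_0 = s ],
    which in the tabular case equals (I - gamma * P)^-1.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def eigenoption_rewards(P, gamma=0.95, k=4):
    """Intrinsic rewards defining the top-k eigenoptions (sketch).

    Each eigenvector e of the SR induces an intrinsic reward
    r(s, s2) = e[s2] - e[s]; the policy maximizing it gives one
    eigenoption. The SR is symmetrized here so that eigh applies --
    an assumption of this sketch, not a step prescribed by the paper.
    """
    psi = successor_representation(P, gamma)
    vals, vecs = np.linalg.eigh((psi + psi.T) / 2.0)
    order = np.argsort(vals)[::-1]          # largest eigenvalues first
    return [vecs[:, i] for i in order[:k]]

# Example: 4-state ring under a uniform-random policy (hypothetical MDP).
P = np.array([[0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.5, 0.0, 0.5, 0.0]])
for e in eigenoption_rewards(P, k=2):
    print(np.round(e, 3))

Maximizing each such reward (and its negation) with any standard RL method yields the corresponding option's policy; the covering options and option keyboard discussed in the abstract build on the same SR but are not shown here.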