Successor Options: An Option Discovery Framework for Reinforcement Learning

Cited by: 0
Authors
Ramesh, Rahul [1 ]
Tomar, Manan [2 ]
Ravindran, Balaraman [1 ]
Affiliations
[1] Indian Inst Technol Madras, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
[2] Indian Inst Technol Madras, Dept Engn Design, Chennai, Tamil Nadu, India
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The options framework in reinforcement learning models the notion of a skill, i.e., a temporally extended sequence of actions. Discovering a reusable set of skills has typically entailed building options that navigate to bottleneck states. This work adopts a complementary approach and attempts to discover options that navigate to landmark states. These states are prototypical representatives of well-connected regions and can therefore access their associated regions with relative ease. We propose Successor Options, which leverages Successor Representations to build a model of the state space. The intra-option policies are learnt using a novel pseudo-reward, and the model scales easily to high-dimensional spaces. We also propose an Incremental Successor Options model that alternates between constructing Successor Representations and building options, which is useful when robust Successor Representations cannot be built solely from primitive actions. We demonstrate the efficacy of the approach on a collection of grid-worlds and on the high-dimensional robotic-control environment Fetch.
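The abstract describes a pipeline of three steps: learn Successor Representations (SR) of the state space, pick landmark states as prototypical representatives of well-connected regions, and train intra-option policies with a pseudo-reward. The Python sketch below is a minimal tabular illustration of that pipeline, assuming a TD(0) update for the SR, k-means over SR rows (via scikit-learn) with landmarks taken as the cluster members closest to each centroid, and a pseudo-reward defined as the increase in the landmark's SR feature. The function names and the exact reward shape are illustrative assumptions, not the paper's definitions.

import numpy as np
from sklearn.cluster import KMeans

def learn_successor_representation(transitions, n_states, gamma=0.95, alpha=0.1):
    """TD(0) update of a tabular Successor Representation.

    transitions: iterable of (s, s_next) state-index pairs collected under some
    behaviour policy (e.g. a random walk over primitive actions).
    """
    psi = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        one_hot = np.zeros(n_states)
        one_hot[s] = 1.0
        # SR TD error: 1_s + gamma * psi(s') - psi(s)
        psi[s] += alpha * (one_hot + gamma * psi[s_next] - psi[s])
    return psi

def landmark_states(psi, n_options, seed=0):
    """Cluster SR rows and return one representative (landmark) state per cluster."""
    km = KMeans(n_clusters=n_options, n_init=10, random_state=seed).fit(psi)
    landmarks = []
    for c in range(n_options):
        members = np.where(km.labels_ == c)[0]
        # pick the member whose SR row is closest to the cluster centroid
        dists = np.linalg.norm(psi[members] - km.cluster_centers_[c], axis=1)
        landmarks.append(int(members[np.argmin(dists)]))
    return landmarks

def pseudo_reward(psi, landmark, s, s_next):
    """Illustrative intra-option reward: change in the landmark's SR feature.
    This is one plausible choice consistent with the abstract, not necessarily
    the pseudo-reward defined in the paper."""
    return psi[s_next, landmark] - psi[s, landmark]

An intra-option policy for each landmark can then be trained with standard Q-learning on this pseudo-reward, terminating the option when the landmark state is reached.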
Pages: 3304 - 3310
Number of pages: 7