Successor Options: An Option Discovery Framework for Reinforcement Learning

Cited by: 0
Authors
Ramesh, Rahul [1 ]
Tomar, Manan [2 ]
Ravindran, Balaraman [1 ]
Affiliations
[1] Indian Inst Technol Madras, Dept Comp Sci & Engn, Chennai, Tamil Nadu, India
[2] Indian Inst Technol Madras, Dept Engn Design, Chennai, Tamil Nadu, India
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The options framework in reinforcement learning models the notion of a skill, i.e., a temporally extended sequence of actions. Discovering a reusable set of skills has typically entailed building options that navigate to bottleneck states. This work adopts a complementary approach and attempts to discover options that navigate to landmark states. These states are prototypical representatives of well-connected regions and can therefore access their associated regions with relative ease. We propose Successor Options, which leverages Successor Representations to build a model of the state space. The intra-option policies are learnt using a novel pseudo-reward, and the model scales easily to high-dimensional spaces. We also propose an Incremental Successor Options model that alternates between constructing Successor Representations and building options, which is useful when robust Successor Representations cannot be built solely from primitive actions. We demonstrate the efficacy of the approach on a collection of grid-worlds and on the high-dimensional robotic-control environment Fetch.
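The abstract describes a pipeline of three steps: learn Successor Representations (SR) of the state space, pick landmark states as prototypical representatives of well-connected regions, and train intra-option policies with a pseudo-reward. The Python sketch below is a minimal tabular illustration of that pipeline, assuming a TD(0) update for the SR, k-means over SR rows (via scikit-learn) with landmarks taken as the cluster members closest to each centroid, and a pseudo-reward defined as the increase in the landmark's SR feature. The function names and the exact reward shape are illustrative assumptions, not the paper's definitions.

import numpy as np
from sklearn.cluster import KMeans

def learn_successor_representation(transitions, n_states, gamma=0.95, alpha=0.1):
    """TD(0) update of a tabular Successor Representation.

    transitions: iterable of (s, s_next) state-index pairs collected under some
    behaviour policy (e.g. a random walk over primitive actions).
    """
    psi = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        one_hot = np.zeros(n_states)
        one_hot[s] = 1.0
        # SR TD error: 1_s + gamma * psi(s') - psi(s)
        psi[s] += alpha * (one_hot + gamma * psi[s_next] - psi[s])
    return psi

def landmark_states(psi, n_options, seed=0):
    """Cluster SR rows and return one representative (landmark) state per cluster."""
    km = KMeans(n_clusters=n_options, n_init=10, random_state=seed).fit(psi)
    landmarks = []
    for c in range(n_options):
        members = np.where(km.labels_ == c)[0]
        # pick the member whose SR row is closest to the cluster centroid
        dists = np.linalg.norm(psi[members] - km.cluster_centers_[c], axis=1)
        landmarks.append(int(members[np.argmin(dists)]))
    return landmarks

def pseudo_reward(psi, landmark, s, s_next):
    """Illustrative intra-option reward: change in the landmark's SR feature.
    This is one plausible choice consistent with the abstract, not necessarily
    the pseudo-reward defined in the paper."""
    return psi[s_next, landmark] - psi[s, landmark]

An intra-option policy for each landmark can then be trained with standard Q-learning on this pseudo-reward, terminating the option when the landmark state is reached.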
Pages: 3304 - 3310
Number of pages: 7