Context-Specific Representation Abstraction for Deep Option Learning

Cited by: 0
Authors
Abdulhai, Marwa [1, 2]
Kim, Dong-Ki [1, 2]
Riemer, Matthew [2, 3]
Liu, Miao [2, 3]
Tesauro, Gerald [2, 3]
How, Jonathan P. [1, 2]
Affiliations
[1] MIT LIDS, Cambridge, MA 02139 USA
[2] MIT IBM Watson AI Lab, Cambridge, MA USA
[3] IBM Res, Armonk, NY USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Hierarchical reinforcement learning has focused on discovering temporally extended actions, such as options, that can provide benefits in problems requiring extensive exploration. One promising approach that learns these options end-to-end is the option-critic (OC) framework. We examine and show in this paper that OC does not decompose a problem into simpler sub-problems, but instead increases the size of the search over policy space, with each option considering the entire state space during learning. This issue can result in practical limitations of this method, including sample-inefficient learning. To address this problem, we introduce Context-Specific Representation Abstraction for Deep Option Learning (CRADOL), a new framework that considers both temporal abstraction and context-specific representation abstraction to effectively reduce the size of the search over policy space. Specifically, our method learns a factored belief state representation that enables each option to learn a policy over only a subsection of the state space. We test our method against hierarchical, non-hierarchical, and modular recurrent neural network baselines, demonstrating significant sample efficiency improvements in challenging partially observable environments.
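As a rough illustration of the search-size argument in the abstract, the following toy sketch compares how many distinct state inputs the options' policies must cover when every option conditions on the full state versus only on its own subset of state factors. It is not the CRADOL architecture: the tabular setting, the four binary factors, the hand-picked factor assignment, and the names `OPTION_FACTORS`, `project`, and `table_rows` are all assumptions made for illustration, whereas the abstract describes a learned factored belief state.

```python
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]

# Hypothetical assignment of state factors to options. It is hand-picked
# here for illustration; the abstract describes a learned representation.
OPTION_FACTORS: Dict[str, Tuple[int, ...]] = {
    "option_a": (0, 1),
    "option_b": (2, 3),
}


def project(state: State, factor_ids: Sequence[int]) -> State:
    """Keep only the factors that one option is allowed to condition on."""
    return tuple(state[i] for i in factor_ids)


def table_rows(num_factors: int, values_per_factor: int = 2) -> int:
    """Number of distinct inputs a tabular policy over these factors covers."""
    return values_per_factor ** num_factors


full_state: State = (1, 0, 1, 1)
num_factors = len(full_state)

# Case 1: every option conditions on the entire state.
rows_full = len(OPTION_FACTORS) * table_rows(num_factors)

# Case 2: each option conditions only on its own subset of factors.
rows_abstracted = sum(table_rows(len(ids)) for ids in OPTION_FACTORS.values())

# What each option actually observes under the abstraction.
views = {name: project(full_state, ids) for name, ids in OPTION_FACTORS.items()}
```

In this toy setting the two options together cover 32 state rows when each sees the full state and 8 rows when each sees only its two assigned factors.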
Pages: 5959-5967
Page count: 9
Related papers
50 in total
  • [1] Bayesian network learning with abstraction hierarchies and context-specific independence
    desJardins, M
    Rathod, P
    Getoor, L
    [J]. MACHINE LEARNING: ECML 2005, PROCEEDINGS, 2005, 3720 : 485 - 496
  • [2] DisenCite: Graph-Based Disentangled Representation Learning for Context-Specific Citation Generation
    Wang, Yifan
    Song, Yiping
    Li, Shuai
    Cheng, Chaoran
    Ju, Wei
    Zhang, Ming
    Wang, Sheng
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11449 - 11458
  • [3] Context-specific learning and its implications for social learning
    Truskanov, Noa
    Shy, Rimon
    Lotem, Arnon
    [J]. BEHAVIORAL ECOLOGY, 2018, 29 (05) : 1046 - 1055
  • [4] Context-specific learning, personality, and birth order
    Harris, JR
    [J]. CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE, 2000, 9 (05) : 174 - 177
  • [5] Learning Markov networks with context-specific independences
    Edera, Alejandro
    Schluter, Federico
    Bromberg, Facundo
    [J]. 2013 IEEE 25TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2013, : 553 - 560
  • [6] A New Perspective on Learning Context-Specific Independence
    Shen, Yujia
    Choi, Arthur
    Darwiche, Adnan
    [J]. INTERNATIONAL CONFERENCE ON PROBABILISTIC GRAPHICAL MODELS, VOL 138, 2020, 138 : 425 - 436
  • [7] Learning Context-Specific Word/Character Embeddings
    Zheng, Xiaoqing
    Feng, Jiangtao
    Chen, Yi
    Peng, Haoyuan
    Zhang, Wenqing
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3393 - 3399
  • [8] Context-specific interaction networks from vector representation of words
    Manica, Matteo
    Mathis, Roland
    Cadow, Joris
    Martinez, Maria Rodriguez
    [J]. NATURE MACHINE INTELLIGENCE, 2019, 1 (04) : 181 - 190
  • [9] Context-specific interaction networks from vector representation of words
    Matteo Manica
    Roland Mathis
    Joris Cadow
    María Rodríguez Martínez
    [J]. Nature Machine Intelligence, 2019, 1 : 181 - 190
  • [10] Context-specific learning of episodic integration in repetition effects
    D'Angelo, Maria C.
    Milliken, Bruce
    [J]. CANADIAN JOURNAL OF EXPERIMENTAL PSYCHOLOGY-REVUE CANADIENNE DE PSYCHOLOGIE EXPERIMENTALE, 2010, 64 (04) : 295 - 295