An Inference-Based Policy Gradient Method for Learning Options

Cited by: 0
Authors
Smith, Matthew J. A. [1]
van Hoof, Herke [2]
Pineau, Joelle [1]
Affiliations
[1] McGill Univ, Dept Comp Sci, Montreal, PQ, Canada
[2] Univ Amsterdam, Informat Inst, Amsterdam, Netherlands
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the pursuit of increasingly intelligent learning systems, abstraction plays a vital role in enabling sophisticated decisions to be made in complex environments. The options framework provides a formalism for such abstraction over sequences of decisions. However, most models require that options be given a priori, presumably specified by hand, which is neither efficient nor scalable. Indeed, it is preferable to learn options directly from interaction with the environment. Despite several efforts, this remains a difficult problem. In this work, we develop a novel policy gradient method for the automatic learning of policies with options. This algorithm uses inference methods to simultaneously improve all of the options available to an agent, and thus can be employed in an off-policy manner, without observing option labels. The differentiable inference procedure employed yields options that can be easily interpreted. Empirical results confirm these attributes and indicate that our algorithm has improved sample efficiency relative to the state of the art in learning options end-to-end.
Pages: 10