The Bottleneck Simulator: A Model-Based Deep Reinforcement Learning Approach

Cited: 0
Authors
Serban, Iulian Vlad [1 ]
Sankar, Chinnadhurai [1 ]
Pieper, Michael [2 ]
Pineau, Joelle [3 ]
Bengio, Yoshua [1 ]
Affiliations
[1] Univ Montreal, Dept Comp Sci & Operat Res, Mila Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
[2] Polytech Montreal, Montreal, PQ, Canada
[3] McGill Univ, Sch Comp Sci, Mila Quebec Artificial Intelligence Inst, Montreal, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
AGGREGATION;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep reinforcement learning has recently shown many impressive successes. However, one major obstacle to applying such methods to real-world problems is their lack of data efficiency. To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method that combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples. The learned transition model employs an abstract, discrete (bottleneck) state, which increases sample efficiency by reducing the number of model parameters and by exploiting structural properties of the environment. We provide a mathematical analysis of the Bottleneck Simulator in terms of fixed points of the learned policy, which reveals how performance is affected by four distinct sources of error: an error related to the abstract space structure, an error related to the transition model estimation variance, an error related to the transition model estimation bias, and an error related to the transition model class bias. Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task. On both tasks, the Bottleneck Simulator yields excellent performance, outperforming competing approaches.
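The abstract outlines a two-step recipe: estimate a transition model over a small, discrete bottleneck state, then learn the policy from rollouts simulated inside that model. The Python sketch below is a minimal, hypothetical illustration of that recipe and not the authors' implementation: the abstraction function phi, the toy logged data, the tabular count-based model, and the Q-learning update are all assumptions made for the example.

```python
# Minimal, illustrative sketch of the Bottleneck Simulator idea: learn a
# factorized transition model over a small discrete abstract ("bottleneck")
# state, then improve a policy with rollouts simulated in that model.
import random
from collections import defaultdict

n_abstract, n_actions = 4, 2          # small discrete bottleneck state/action spaces
gamma, episodes, horizon = 0.95, 200, 20

def phi(raw_state):
    """Hypothetical abstraction: map a raw observation to a bottleneck state."""
    return raw_state % n_abstract

# 1. Estimate an abstract transition/reward model from logged experience.
counts = defaultdict(lambda: defaultdict(int))   # (z, a) -> {z': visit count}
rewards = defaultdict(list)                      # (z, a) -> observed rewards

def record(raw_s, a, r, raw_s_next):
    z, z_next = phi(raw_s), phi(raw_s_next)
    counts[(z, a)][z_next] += 1
    rewards[(z, a)].append(r)

rng = random.Random(0)
for _ in range(500):                  # toy logged transitions, purely illustrative
    s = rng.randrange(100)
    a = rng.randrange(n_actions)
    s_next = (s + a + rng.randrange(3)) % 100
    record(s, a, 1.0 if phi(s_next) == 0 else 0.0, s_next)

def simulate_step(z, a):
    """Sample (z', r) from the estimated abstract transition model."""
    dist = counts[(z, a)]
    if not dist:                      # unseen state-action pair: stay put, no reward
        return z, 0.0
    z_next = rng.choices(list(dist.keys()), weights=list(dist.values()))[0]
    r_hat = sum(rewards[(z, a)]) / len(rewards[(z, a)])
    return z_next, r_hat

# 2. Learn a policy from rollouts simulated inside the learned model.
Q = defaultdict(float)                # (z, a) -> action-value estimate
alpha, eps = 0.1, 0.1
for _ in range(episodes):
    z = rng.randrange(n_abstract)
    for _ in range(horizon):
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: Q[(z, x)])
        z_next, r = simulate_step(z, a)
        best_next = max(Q[(z_next, x)] for x in range(n_actions))
        Q[(z, a)] += alpha * (r + gamma * best_next - Q[(z, a)])
        z = z_next

# Greedy policy in the abstract space.
print({z: max(range(n_actions), key=lambda a: Q[(z, a)]) for z in range(n_abstract)})
```

Because the simulated rollouts operate in the small abstract space, the number of transition parameters to estimate grows with the number of bottleneck states rather than with the raw observation space, which is the sample-efficiency argument made in the abstract.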
Pages: 571-612
Number of pages: 42
Related Papers
50 records in total
  • [1] The bottleneck simulator: A model-based deep reinforcement learning approach
    Serban, Iulian Vlad
    Sankar, Chinnadhurai
    Pieper, Michael
    Pineau, Joelle
    Bengio, Yoshua
    [J]. Journal of Artificial Intelligence Research, 2020, 69 : 571 - 612
  • [2] Learning to Paint With Model-based Deep Reinforcement Learning
    Huang, Zhewei
    Heng, Wen
    Zhou, Shuchang
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 8708 - 8717
  • [3] Calibrated Model-Based Deep Reinforcement Learning
    Malik, Ali
    Kuleshov, Volodymyr
    Song, Jiaming
    Nemer, Danny
    Seymour, Harlan
    Ermon, Stefano
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [4] A Contraction Approach to Model-based Reinforcement Learning
    Fan, Ting-Han
    Ramadge, Peter J.
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130 : 325 - +
  • [5] An Efficient Approach to Model-Based Hierarchical Reinforcement Learning
    Li, Zhuoru
    Narayan, Akshay
    Leong, Tze-Yun
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3583 - 3589
  • [6] A Model-based Factored Bayesian Reinforcement Learning Approach
    Wu, Bo
    Feng, Yanpeng
    Zheng, Hongyan
    [J]. APPLIED SCIENCE, MATERIALS SCIENCE AND INFORMATION TECHNOLOGIES IN INDUSTRY, 2014, 513-517 : 1092 - 1095
  • [7] Model-based deep reinforcement learning for wind energy bidding
    Sanayha, Manassakan
    Vateekul, Peerapon
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2022, 136
  • [8] Knowledge Transfer using Model-Based Deep Reinforcement Learning
    Boloka, Tlou
    Makondo, Ndivhuwo
    Rosman, Benjamin
    [J]. 2021 SOUTHERN AFRICAN UNIVERSITIES POWER ENGINEERING CONFERENCE/ROBOTICS AND MECHATRONICS/PATTERN RECOGNITION ASSOCIATION OF SOUTH AFRICA (SAUPEC/ROBMECH/PRASA), 2021,
  • [9] Deep Reinforcement Learning with Model-based Acceleration for Hyperparameter Optimization
    Chen, SenPeng
    Wu, Jia
    Chen, XiuYun
    [J]. 2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 170 - 177
  • [10] SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
    Zhang, Marvin
    Vikram, Sharad
    Smith, Laura
    Abbeel, Pieter
    Johnson, Matthew J.
    Levine, Sergey
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97