The Bottleneck Simulator: A Model-Based Deep Reinforcement Learning Approach

Cited by: 0

Authors:

Serban, Iulian Vlad [1]

Sankar, Chinnadhurai [1]

Pieper, Michael [2]

Pineau, Joelle [3]

Bengio, Yoshua [1]

Affiliations:

[1] Univ Montreal, Dept Comp Sci & Operat Res, Mila Quebec Artificial Intelligence Inst, Montreal, PQ, Canada

[2] Polytech Montreal, Montreal, PQ, Canada

[3] McGill Univ, Sch Comp Sci, Mila Quebec Artificial Intelligence Inst, Montreal, PQ, Canada

Source:

JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH | 2020 / Vol. 69

Funding:

Natural Sciences and Engineering Research Council of Canada (NSERC);

Keywords:

AGGREGATION;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Deep reinforcement learning has recently shown many impressive successes. However, one major obstacle to applying such methods to real-world problems is their lack of data efficiency. To this end, we propose the Bottleneck Simulator: a model-based reinforcement learning method which combines a learned, factorized transition model of the environment with rollout simulations to learn an effective policy from few examples. The learned transition model employs an abstract, discrete (bottleneck) state, which increases sample efficiency by reducing the number of model parameters and by exploiting structural properties of the environment. We provide a mathematical analysis of the Bottleneck Simulator in terms of fixed points of the learned policy, which reveals how performance is affected by four distinct sources of error: an error related to the abstract space structure, an error related to the transition model estimation variance, an error related to the transition model estimation bias, and an error related to the transition model class bias. Finally, we evaluate the Bottleneck Simulator on two natural language processing tasks: a text adventure game and a real-world, complex dialogue response selection task. On both tasks, the Bottleneck Simulator yields excellent performance, beating competing approaches.
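
As a rough illustration of the idea in the abstract, the sketch below fits a transition model that passes through a discrete abstract state and then uses it for rollout simulation. Everything here is an assumption made for illustration and is not taken from this record: the factorization P(s' | s, a) ≈ Σ_z' P(z' | z, a) P(s' | z') with z = abstract(s), the count-based estimators, the six-state chain environment, and all function names. The paper's actual model may differ.

```python
import random
from collections import Counter, defaultdict


def fit_bottleneck_model(transitions, abstract):
    """Count-based estimates of P(z' | z, a) and P(s' | z').

    `abstract` is a hypothetical, hand-specified mapping from a full
    state s to a discrete abstract (bottleneck) state z.
    """
    z_counts = defaultdict(Counter)  # (z, a) -> counts over z'
    s_counts = defaultdict(Counter)  # z'     -> counts over s'
    for s, a, s_next in transitions:
        z, z_next = abstract(s), abstract(s_next)
        z_counts[(z, a)][z_next] += 1
        s_counts[z_next][s_next] += 1
    return z_counts, s_counts


def sample(counter, rng):
    """Draw one key with probability proportional to its count."""
    keys = list(counter.keys())
    weights = list(counter.values())
    return rng.choices(keys, weights=weights, k=1)[0]


def simulate_step(model, abstract, s, a, rng):
    """Sample s' by first sampling z' from P(z' | z, a), then s' from P(s' | z').

    Assumes (abstract(s), a) was observed in the data; an unseen pair
    has no counts and cannot be sampled.
    """
    z_counts, s_counts = model
    z_next = sample(z_counts[(abstract(s), a)], rng)
    return sample(s_counts[z_next], rng)


def rollout(model, abstract, policy, s0, horizon, rng):
    """Simulate `horizon` steps from s0 under `policy`; return visited states."""
    states = [s0]
    for _ in range(horizon):
        s = states[-1]
        states.append(simulate_step(model, abstract, s, policy(s), rng))
    return states


# Toy chain environment (assumed): states 0..5, actions -1 / +1, clamped.
def true_step(s, a):
    return min(5, max(0, s + a))


def abstract(s):
    return s // 2  # three abstract states: {0,1}, {2,3}, {4,5}


# One observed transition for every (state, action) pair.
data = [(s, a, true_step(s, a)) for s in range(6) for a in (-1, 1)]
model = fit_bottleneck_model(data, abstract)
```

In this toy setting the abstract transition table has 3 × 2 rows instead of 6 × 2, which is the kind of parameter reduction the abstract refers to; the price is that simulated next states are only correct up to the abstract state, which is one way an error tied to the abstraction structure can enter.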

Pages: 571-612

Page count: 42
