Improving Reinforcement Learning Pre-Training with Variational Dropout

Cited by: 0
Authors
Blau, Tom [1 ]
Ott, Lionel [1 ]
Ramos, Fabio [1 ]
Affiliations
[1] Univ Sydney, Sch Informat Technol, Sydney, NSW, Australia
Keywords
DOI
None available
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning has been very successful at learning control policies for robotic agents performing various tasks, such as driving around a track, navigating a maze, and bipedal locomotion. A significant drawback of reinforcement learning methods is that they require a large number of data points to learn good policies, a limitation known as poor data efficiency or poor sample efficiency. One approach to improving sample efficiency is supervised pre-training of policies to directly clone the behavior of an expert, but this suffers from poor generalization far from the training data. We propose to improve on this by using Gaussian dropout networks with a regularization term based on variational inference in the pre-training step. We show that this initializes policy parameters to significantly better values than standard supervised learning or random initialization, thus greatly reducing sample complexity compared with state-of-the-art methods and enabling an RL algorithm to learn optimal policies for high-dimensional continuous control problems in a practical time frame.
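The core idea described in the abstract, behavior cloning with Gaussian dropout layers whose noise rates are learned under a variational-inference regularizer, can be illustrated with a short sketch. The following is a minimal, hypothetical PyTorch sketch, not the authors' implementation: it uses the Gaussian dropout parameterization with the local reparameterization trick (Kingma et al., 2015) and the approximate KL term of Molchanov et al. (2017). The layer name `VariationalDropoutLinear`, the network sizes, and the weighting `beta` are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalDropoutLinear(nn.Module):
    """Linear layer with learned per-weight multiplicative Gaussian noise."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # log(alpha): learned per-weight noise variance, initialized small.
        self.log_alpha = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        mean = F.linear(x, self.weight, self.bias)
        if not self.training:
            return mean
        # Local reparameterization trick: sample noisy pre-activations directly.
        var = F.linear(x ** 2, self.log_alpha.exp() * self.weight ** 2)
        return mean + var.clamp_min(1e-8).sqrt() * torch.randn_like(mean)

    def kl(self):
        # Approximate KL(q(w)||p(w)) for a log-uniform prior (Molchanov et al., 2017).
        k1, k2, k3 = 0.63576, 1.87320, 1.48695
        la = self.log_alpha
        neg_kl = k1 * torch.sigmoid(k2 + k3 * la) - 0.5 * F.softplus(-la) - k1
        return -neg_kl.sum()

obs_dim, act_dim = 11, 3  # illustrative dimensions for a continuous control task
policy = nn.Sequential(
    VariationalDropoutLinear(obs_dim, 64), nn.Tanh(),
    VariationalDropoutLinear(64, act_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def pretrain_step(states, expert_actions, beta=1e-4):
    """One behavior-cloning step: imitation loss plus the VI regularizer."""
    optimizer.zero_grad()
    bc_loss = F.mse_loss(policy(states), expert_actions)
    kl = sum(m.kl() for m in policy.modules()
             if isinstance(m, VariationalDropoutLinear))
    loss = bc_loss + beta * kl
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, pre-training on expert (state, action) pairs yields the initial policy parameters for the subsequent RL phase, which is where the abstract's claimed reduction in sample complexity would come from.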
Pages: 4115-4122
Page count: 8
Related papers
50 records in total
  • [1] Pre-training Framework for Improving Learning Speed of Reinforcement Learning based Autonomous Vehicles
    Kim, Jung-Jae; Cha, Si-Ho; Ryu, Minwoo; Jo, Minho
    2019 International Conference on Electronics, Information, and Communication (ICEIC), 2019: 321-322
  • [2] Supervised pre-training for improved stability in deep reinforcement learning
    Jang, Sooyoung; Kim, Hyung-Il
    ICT Express, 2023, 9(1): 51-56
  • [3] RePreM: Representation Pre-training with Masked Model for Reinforcement Learning
    Cai, Yuanying; Zhang, Chuheng; Shen, Wei; Zhang, Xuyun; Ruan, Wenjie; Huang, Longbo
    Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol. 37, No. 6, 2023: 6879-6887
  • [4] Improving Fractal Pre-training
    Anderson, Connor; Farrell, Ryan
    2022 IEEE Winter Conference on Applications of Computer Vision (WACV 2022), 2022: 2412-2421
  • [5] Pre-training with asynchronous supervised learning for reinforcement learning based autonomous driving
    Wang, Yunpeng; Zheng, Kunxian; Tian, Daxin; Duan, Xuting; Zhou, Jianshan
    Frontiers of Information Technology & Electronic Engineering, 2021, 22(5): 673-686
  • [6] On the Effect of Pre-training for Transformer in Different Modality on Offline Reinforcement Learning
    Takagi, Shiro
    Advances in Neural Information Processing Systems 35 (NeurIPS 2022), 2022
  • [7] Reinforcement Learning with Action-Free Pre-Training from Videos
    Seo, Younggyo; Lee, Kimin; James, Stephen; Abbeel, Pieter
    International Conference on Machine Learning, Vol. 162, 2022: 19561-19579
  • [8] Improving fault localization with pre-training
    Zhang, Zhuo; Li, Ya; Xue, Jianxin; Mao, Xiaoguang
    Frontiers of Computer Science, 2024, 18(1)
  • [9] APD: Learning Diverse Behaviors for Reinforcement Learning Through Unsupervised Active Pre-Training
    Zeng, Kailin; Zhang, QiYuan; Chen, Bin; Liang, Bin; Yang, Jun
    IEEE Robotics and Automation Letters, 2022, 7(4): 12251-12258