Scalable Multitask Policy Gradient Reinforcement Learning

Cited: 0
Authors
El Bsat, Salam [1 ]
Ammar, Haitham Bou [2 ]
Taylor, Matthew E. [3 ]
Affiliations
[1] Rafik Hariri Univ, Mechref, Lebanon
[2] Amer Univ Beirut, Beirut, Lebanon
[3] Washington State Univ, Pullman, WA 99164 USA
Keywords
DOI
None available
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach that aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer from scalability issues when considering a large number of tasks. The main reason behind this limitation is the reliance on centralized solutions. This paper proposes a novel distributed multitask RL framework that improves scalability across many different types of tasks. Our framework maps multitask RL to an instance of general consensus and develops an efficient decentralized solver. We justify the correctness of the algorithm both theoretically and empirically: we first prove an improvement in convergence speed to an order of O(1/k), with k being the number of iterations, and then show that our algorithm surpasses others on multiple dynamical-system benchmarks.
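The abstract describes casting multitask policy gradient RL as a general consensus problem solved in a decentralized way. The sketch below is a minimal illustration of that general pattern only, not the paper's algorithm: each hypothetical task node takes a local policy-gradient step and then averages its parameters with neighbors on a ring. The names (local_policy_gradient, consensus_multitask_pg), the ring topology, the mixing weight, and the step size are all illustrative assumptions.

```python
# Minimal sketch of a consensus-style multitask policy-gradient loop.
# Assumptions (not from the paper): a ring communication topology, a fixed
# mixing weight, and a placeholder gradient estimator.
import numpy as np

def local_policy_gradient(theta, rng):
    # Placeholder for a task-specific policy-gradient estimate (e.g., REINFORCE).
    # Random noise is returned here only so the sketch runs end to end.
    return rng.standard_normal(theta.shape)

def consensus_multitask_pg(num_tasks=4, dim=8, iters=100, step=0.01, mix=0.5, seed=0):
    rng = np.random.default_rng(seed)
    thetas = [np.zeros(dim) for _ in range(num_tasks)]  # one policy parameter vector per task
    for _ in range(iters):
        # 1) Each task node takes a local policy-gradient ascent step.
        thetas = [th + step * local_policy_gradient(th, rng) for th in thetas]
        # 2) Consensus step: each node averages with its ring neighbors,
        #    pulling the local parameters toward a shared solution.
        updated = []
        for i, th in enumerate(thetas):
            left = thetas[(i - 1) % num_tasks]
            right = thetas[(i + 1) % num_tasks]
            updated.append((1 - mix) * th + mix * 0.5 * (left + right))
        thetas = updated
    return thetas

if __name__ == "__main__":
    params = consensus_multitask_pg()
    print("parameter disagreement:", max(np.linalg.norm(p - params[0]) for p in params))
```

In an actual consensus solver, the mixing weights and step sizes would be chosen to obtain the stated O(1/k) rate; here they are fixed constants purely for readability.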
Pages: 1847 - 1853
Page count: 7
Related Papers (50 items in total)
  • [21] Distral: Robust Multitask Reinforcement Learning
    Teh, Yee Whye
    Bapst, Victor
    Czarnecki, Wojciech Marian
    Quan, John
    Kirkpatrick, James
    Hadsell, Raia
    Heess, Nicolas
    Pascanu, Razvan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
  • [22] Sharing Experience in Multitask Reinforcement Learning
    Tung-Long Vuong
    Do-Van Nguyen
    Tai-Long Nguyen
    Cong-Minh Bui
    Hai-Dang Kieu
    Viet-Cuong Ta
    Quoc-Long Tran
    Thanh-Ha Le
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 3642 - 3648
  • [23] Scalable Multitask Representation Learning for Scene Classification
    Lapin, Maksim
    Schiele, Bernt
    Hein, Matthias
    2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, : 1434 - 1441
  • [24] Provable Benefit of Multitask Representation Learning in Reinforcement Learning
    Cheng, Yuan
    Feng, Songtao
    Yang, Jing
    Zhang, Hong
    Liang, Yingbin
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [25] Multitask Learning for Object Localization With Deep Reinforcement Learning
    Wang, Yan
    Zhang, Lei
    Wang, Lituan
    Wang, Zizhou
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2019, 11 (04) : 573 - 580
  • [26] Hessian matrix distribution for Bayesian policy gradient reinforcement learning
    Ngo Anh Vien
    Yu, Hwanjo
    Chung, TaeChoong
    INFORMATION SCIENCES, 2011, 181 (09) : 1671 - 1685
  • [27] Using policy gradient reinforcement learning on autonomous robot controllers
    Grudic, GZ
    Kumar, V
    Ungar, L
    IROS 2003: PROCEEDINGS OF THE 2003 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-4, 2003, : 406 - 411
  • [28] Reinforcement Learning based on MPC and the Stochastic Policy Gradient Method
    Gros, Sebastien
    Zanon, Mario
    2021 AMERICAN CONTROL CONFERENCE (ACC), 2021, : 1947 - 1952
  • [29] Molecule generation using transformers and policy gradient reinforcement learning
    Mazuz, Eyal
    Shtar, Guy
    Shapira, Bracha
    Rokach, Lior
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [30] Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
    Morimura, Tetsuro
    Uchibe, Eiji
    Yoshimoto, Junichiro
    Peters, Jan
    Doya, Kenji
    NEURAL COMPUTATION, 2010, 22 (02) : 342 - 376