Scalable Multitask Policy Gradient Reinforcement Learning

Cited by: 0

Authors:

El Bsat, Salam [1]

Ammar, Haitham Bou [2]

Taylor, Matthew E. [3]

Affiliations:

[1] Rafik Hariri Univ, Mechref, Lebanon

[2] Amer Univ Beirut, Beirut, Lebanon

[3] Washington State Univ, Pullman, WA 99164 USA

Source:

THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2017

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience to achieve successful behavior because of their tabula rasa nature. Multitask RL aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer from scalability issues when considering a large number of tasks. The main reason behind this limitation is the reliance on centralized solutions. This paper proposes a novel distributed multitask RL framework, improving scalability across many different types of tasks. Our framework maps multitask RL to an instance of general consensus and develops an efficient decentralized solver. We justify the correctness of the algorithm both theoretically and empirically: we first prove an improvement of convergence speed to an order of O(1/k), with k being the number of iterations, and then show our algorithm surpassing others on multiple dynamical system benchmarks.
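
The abstract refers to casting multitask learning as a consensus problem. As a rough illustration only, the sketch below runs the standard global-consensus ADMM iteration on a toy problem in which each "task" is a small least-squares fit and all tasks must agree on one shared parameter vector. This is not the paper's algorithm: the quadratic task losses, the function name `consensus_admm`, the penalty `rho`, and the synthetic data are all assumptions made for the example, and the averaging step here is a plain global average, not the decentralized solver the abstract describes.

```python
import numpy as np


def consensus_admm(A_list, b_list, rho=10.0, iters=500):
    """Global-consensus ADMM for: min sum_i 0.5*||A_i x_i - b_i||^2  s.t.  x_i = z.

    Hypothetical helper for illustration; each (A_i, b_i) pair stands in
    for one task's local objective.
    """
    n_tasks = len(A_list)
    dim = A_list[0].shape[1]
    x = np.zeros((n_tasks, dim))  # local copies, one row per task
    u = np.zeros((n_tasks, dim))  # scaled dual variables
    z = np.zeros(dim)             # shared (consensus) variable

    # Precompute the pieces of each local closed-form update.
    lhs = [A.T @ A + rho * np.eye(dim) for A in A_list]
    Atb = [A.T @ b for A, b in zip(A_list, b_list)]

    for _ in range(iters):
        # Local step: each task solves its own regularized least squares.
        for i in range(n_tasks):
            x[i] = np.linalg.solve(lhs[i], Atb[i] + rho * (z - u[i]))
        # Consensus step: average the local estimates plus duals.
        z = (x + u).mean(axis=0)
        # Dual step: accumulate each task's disagreement with the consensus.
        u += x - z
    return z, x


# Synthetic data (assumed for the example): 4 tasks sharing one true parameter.
rng = np.random.default_rng(0)
x_true = rng.normal(size=3)
A_list = [rng.normal(size=(20, 3)) for _ in range(4)]
b_list = [A @ x_true + 0.1 * rng.normal(size=20) for A in A_list]

z, x_local = consensus_admm(A_list, b_list)

# Reference: solve the stacked problem centrally.
x_central = np.linalg.lstsq(np.vstack(A_list), np.concatenate(b_list), rcond=None)[0]
```

In this toy setting the consensus variable `z` should agree with the centralized least-squares solution `x_central`, which is the sense in which the local updates can stand in for a single central solve.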

Citation

Pages: 1847-1853

Page count: 7

50 records in total

[1] Modular Multitask Reinforcement Learning with Policy Sketches
Andreas, Jacob
Klein, Dan
Levine, Sergey
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
[2] Policy gradient fuzzy reinforcement learning
Wang, XN
Xu, X
He, HG
PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 992 - 995
[3] A modification of gradient policy in reinforcement learning procedure
Abas, Marcel
Skripcak, Tomas
2012 15TH INTERNATIONAL CONFERENCE ON INTERACTIVE COLLABORATIVE LEARNING (ICL), 2012,
[4] Adaptive Natural Policy Gradient in Reinforcement Learning
Li, Dazi
Qiao, Zengyuan
Song, Tianheng
Jin, Qibing
PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
[5] Policy Gradient Method For Robust Reinforcement Learning
Wang, Yue
Zou, Shaofeng
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
[6] Reinforcement Learning to Rank with Pairwise Policy Gradient
Xu, Jun
Wei, Zeng
Xia, Long
Lan, Yanyan
Yin, Dawei
Cheng, Xueqi
Wen, Ji-Rong
PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 509 - 518
[7] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
Kim, Dong-Ki
Liu, Miao
Riemer, Matthew
Sun, Chuangchuang
Abdulhai, Marwa
Habibi, Golnaz
Lopez-Cot, Sebastian
Tesauro, Gerald
How, Jonathan P.
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
[8] Scalable Feature Selection for (Multitask) Gradient Boosted Trees
Han, Cuize
Rao, Nikhil
Sorokina, Daria
Subbian, Karthik
INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 885 - 893
[9] Policy gradient reinforcement learning for fast quadrupedal locomotion
Kohl, N
Stone, P
2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2619 - 2624
[10] Policy Gradient using Weak Derivatives for Reinforcement Learning
Bhatt, Sujay
Koppel, Alec
Krishnamurthy, Vikram
2019 53RD ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2019,
