Scalable Multitask Policy Gradient Reinforcement Learning

被引:0
|
作者
El Bsat, Salam [1 ]
Ammar, Haitham Bou [2 ]
Taylor, Matthew E. [3 ]
机构
[1] Rafik Hariri Univ, Mechref, Lebanon
[2] Amer Univ Beirut, Beirut, Lebanon
[3] Washington State Univ, Pullman, WA 99164 USA
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Policy search reinforcement learning (RL) allows agents to learn autonomously with limited feedback. However, such methods typically require extensive experience for successful behavior due to their tabula rasa nature. Multitask RL is an approach, which aims to reduce data requirements by allowing knowledge transfer between tasks. Although successful, current multitask learning methods suffer from scalability issues when considering large number of tasks. The main reasons behind this limitation is the reliance on centralized solutions. This paper proposes to a novel distributed multitask RL framework, improving the scalability across many different types of tasks. Our framework maps multitask RL to an instance of general consensus and develops an efficient decentralized solver. We justify the correctness of the algorithm both theoretically and empirically: we first proof an improvement of convergence speed to an order of O(1/k) with k being the number of iterations, and then show our algorithm surpassing others on multiple dynamical system benchmarks.
引用
收藏
页码:1847 / 1853
页数:7
相关论文
共 50 条
  • [1] Modular Multitask Reinforcement Learning with Policy Sketches
    Andreas, Jacob
    Klein, Dan
    Levine, Sergey
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [2] Policy gradient fuzzy reinforcement learning
    Wang, XN
    Xu, X
    He, HG
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 992 - 995
  • [3] A modification of gradient policy in reinforcement learning procedure
    Abas, Marcel
    Skripcak, Tomas
    [J]. 2012 15TH INTERNATIONAL CONFERENCE ON INTERACTIVE COLLABORATIVE LEARNING (ICL), 2012,
  • [4] Adaptive Natural Policy Gradient in Reinforcement Learning
    Li, Dazi
    Qiao, Zengyuan
    Song, Tianheng
    Jin, Qibing
    [J]. PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), 2018, : 605 - 610
  • [5] Policy Gradient Method For Robust Reinforcement Learning
    Wang, Yue
    Zou, Shaofeng
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [6] Reinforcement Learning to Rank with Pairwise Policy Gradient
    Xu, Jun
    Wei, Zeng
    Xia, Long
    Lan, Yanyan
    Yin, Dawei
    Cheng, Xueqi
    Wen, Ji-Rong
    [J]. PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 509 - 518
  • [7] A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
    Kim, Dong-Ki
    Liu, Miao
    Riemer, Matthew
    Sun, Chuangchuang
    Abdulhai, Marwa
    Habibi, Golnaz
    Lopez-Cot, Sebastian
    Tesauro, Gerald
    How, Jonathan P.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [8] Scalable Feature Selection for (Multitask) Gradient Boosted Trees
    Han, Cuize
    Rao, Nikhil
    Sorokina, Daria
    Subbian, Karthik
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 885 - 893
  • [9] On the use of the policy gradient and Hessian in inverse reinforcement learning
    Metelli, Alberto Maria
    Pirotta, Matteo
    Restelli, Marcello
    [J]. INTELLIGENZA ARTIFICIALE, 2020, 14 (01) : 117 - 150
  • [10] Policy gradient reinforcement learning for fast quadrupedal locomotion
    Kohl, N
    Stone, P
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2619 - 2624