Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

Authors
Yang, Tianpei [1, 2]
Hao, Jianye [1, 2, 3]
Meng, Zhaopeng [1 ]
Zhang, Zongzhang [4 ]
Hu, Yujing [5 ]
Chen, Yingfeng [5 ]
Fan, Changjie [5 ]
Wang, Weixun [1 ]
Liu, Wulong [2 ]
Wang, Zhaodong [6 ]
Peng, Jiajie [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin, Peoples R China
[2] Huawei, Noahs Ark Lab, Hong Kong, Peoples R China
[3] Tianjin Key Lab Machine Learning, Tianjin, Peoples R China
[4] Nanjing Univ, Nanjing, Jiangsu, Peoples R China
[5] Fuxi AI Lab Netease, Hangzhou, Peoples R China
[6] JD Digits, Beijing, Peoples R China
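Funding
National Natural Science Foundation of China;
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract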
Transfer learning has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from previously learned policies of relevant tasks. Existing approaches either transfer previous knowledge by explicitly computing similarities between tasks or select appropriate source policies to provide guided exploration. However, a way to directly optimize the target policy by alternately utilizing knowledge from appropriate source policies, without explicitly measuring task similarities, is currently missing. In this paper, we propose a novel Policy Transfer Framework (PTF) built on this idea. By modeling multi-policy transfer as an option learning problem, PTF learns when to reuse a source policy, which source policy is best for the target policy, and when to terminate that reuse. PTF can be easily combined with existing deep RL (DRL) methods, and experimental results show that it significantly accelerates RL and surpasses state-of-the-art policy transfer methods in learning efficiency and final performance in both discrete and continuous action spaces.
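The option-learning view in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: the class name SourcePolicySelector, its methods, the default termination probability of 0.5, and the epsilon-greedy choice are all assumptions made for illustration. Each source policy is treated as one option, an option keeps running until its termination probability fires, and option values are updated with the standard one-step intra-option rule.

```python
import random


class SourcePolicySelector:
    """Hypothetical helper: treats each source policy as an option."""

    def __init__(self, n_sources, epsilon=0.1, seed=0):
        self.n_sources = n_sources
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.q = {}      # (state, option) -> estimated option value
        self.beta = {}   # (state, option) -> termination probability
        self.active = None  # index of the source policy currently reused

    def option_value(self, state, option):
        return self.q.get((state, option), 0.0)

    def termination_prob(self, state, option):
        # Assumed default of 0.5 for unseen (state, option) pairs.
        return self.beta.get((state, option), 0.5)

    def step(self, state):
        """Return the index of the source policy to reuse in `state`."""
        if self.active is not None:
            # Keep the current option unless its termination fires.
            if self.rng.random() >= self.termination_prob(state, self.active):
                return self.active
        # Pick a new option epsilon-greedily over option values.
        if self.rng.random() < self.epsilon:
            self.active = self.rng.randrange(self.n_sources)
        else:
            self.active = max(
                range(self.n_sources),
                key=lambda o: self.option_value(state, o),
            )
        return self.active

    def update(self, state, option, reward, next_state, alpha=0.1, gamma=0.99):
        """One-step intra-option value update for the executed option."""
        b = self.termination_prob(next_state, option)
        cont = self.option_value(next_state, option)
        best = max(self.option_value(next_state, o) for o in range(self.n_sources))
        # Continue with the same option w.p. (1 - b), else switch to the best one.
        target = reward + gamma * ((1.0 - b) * cont + b * best)
        old = self.option_value(state, option)
        self.q[(state, option)] = old + alpha * (target - old)
```

In a full agent, the index returned by step would pick which source policy guides the target policy in the current state; how that guidance enters the target policy's loss is left out here because the abstract does not specify it.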
Pages: 3094-3100
Number of pages: 7