Particle swarm optimization based multi-task parallel reinforcement learning algorithm

Cited by: 3
Authors
Duan Junhua [1]
Zhu Yi-an [1]
Zhong Dong [1]
Zhang Lixiang [1]
Zhang Lin [1]
Affiliation
[1] Northwestern Polytech Univ, Sch Comp, 127 West Youyi Rd, Xian 710072, Shaanxi, Peoples R China
Keywords
Multi-task reinforcement learning; parallel reinforcement learning; particle swarm optimization; transfer learning
DOI
10.3233/JIFS-190209
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Transfer learning has been shown to speed up machine learning in many areas. In multi-task reinforcement learning, it allows experience to be transferred between tasks. This article focuses on two aspects. First, multi-task parallel transfer learning can improve the learning speed of tasks that are learned in parallel. Second, learning from the current optimal experience helps propagate the reward at the target point back to the starting point; this self-learning also accelerates the convergence of reinforcement learning. Building on these two aspects, the paper uses the idea of particle swarm optimization (PSO) to carry out self-learning and interactive learning in multi-task parallel learning, and proposes a new multi-task learning algorithm named PSO-MTPRL (Multi-Task Parallel Reinforcement Learning based on PSO). Following the idea of the PSO algorithm, the Boltzmann strategy, the Self-Learning Process (SLP), and the Interactive Learning Process (ILP) are selected probabilistically. Based on the characteristics of reinforcement learning, a segmented learning model is recommended: in the early learning stage, the complete Boltzmann exploration strategy is applied; the B-SLP-ILP (Boltzmann-SLP-ILP) learning procedure is conducted only in the middle stage; and in the late learning stage, Boltzmann exploration is used again. The segmented learning model helps balance exploration and exploitation and ensures that all tasks converge.
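The segmented selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the stage boundaries (`early_frac`, `late_frac`), and the selection weights are hypothetical values chosen for illustration, and the SLP and ILP update rules are not specified in the abstract, so they appear only as labels. The Boltzmann part is the standard softmax over action values with a temperature.

```python
import math
import random


def boltzmann_probs(q_values, temperature):
    """Standard Boltzmann (softmax) distribution over action values."""
    # Subtract the maximum before exponentiating for numerical stability.
    q_max = max(q_values)
    exps = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def boltzmann_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def choose_procedure(episode, total_episodes,
                     early_frac=0.2, late_frac=0.2,
                     weights=(0.4, 0.3, 0.3), rng=random):
    """Pick the learning procedure for one episode.

    Early and late stages use Boltzmann exploration only; the middle
    stage picks Boltzmann, SLP, or ILP probabilistically.
    ASSUMPTION: the fractions and weights are illustrative defaults,
    not values taken from the paper.
    """
    early_end = early_frac * total_episodes
    late_start = (1.0 - late_frac) * total_episodes
    if episode < early_end or episode >= late_start:
        return "boltzmann"
    return rng.choices(("boltzmann", "slp", "ilp"), weights=weights, k=1)[0]


if __name__ == "__main__":
    # Show which procedure is selected at a few points in a 100-episode run.
    for ep in (0, 10, 50, 95):
        print(ep, choose_procedure(ep, 100))
    print(boltzmann_action([0.1, 0.5, 0.2], temperature=0.5))
```

In this sketch a lower temperature concentrates probability on the highest-valued action, while a higher temperature approaches uniform exploration; how PSO-MTPRL actually schedules the temperature and the selection probabilities is not stated in the abstract.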
Pages: 8567-8575
Page count: 9