A Continuous Actor-Critic Reinforcement Learning Approach to Flocking with Fixed-Wing UAVs

Cited by: 0
Authors
Wang, Chang [1 ]
Yan, Chao [1 ]
Xiang, Xiaojia [1 ]
Zhou, Han [1 ]
Affiliations
[1] National University of Defense Technology, Changsha, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
unmanned aerial vehicle (UAV); flocking; reinforcement learning; actor-critic; experience replay; agents
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Controlling a squad of fixed-wing UAVs is challenging due to their kinematic complexity and the dynamics of the environment. In this paper, we develop a novel actor-critic reinforcement learning approach to solve the leader-follower flocking problem in continuous state and action spaces. Specifically, we propose the CACER algorithm, which represents both the actor and the critic with multilayer perceptrons; this deeper structure provides a better function approximator than the original continuous actor-critic learning automaton (CACLA) algorithm. In addition, we propose a double prioritized experience replay (DPER) mechanism to further improve training efficiency: state-transition samples are stored in two separate experience replay buffers for updating the actor and the critic, with sample priorities computed from the temporal-difference errors. We not only compare CACER with CACLA and the benchmark deep reinforcement learning algorithm DDPG in numerical simulation, but also demonstrate the performance of CACER in semi-physical simulation by transferring the policy learned in numerical simulation without any parameter tuning.
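To make the DPER mechanism described in the abstract more concrete, the following Python snippet shows one plausible way to organize the two prioritized buffers. It is a minimal sketch, not the authors' implementation: the class names PrioritizedBuffer and DoublePrioritizedReplay, the priority exponent alpha, and the rule of routing only positive-TD-error transitions to the actor buffer (in the spirit of CACLA) are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


class PrioritizedBuffer:
    """Fixed-size replay buffer with proportional, TD-error-based prioritization."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # how strongly the priority shapes the sampling distribution (assumed value)
        self.eps = eps          # keeps every stored priority strictly positive
        self.data = []
        self.priorities = []

    def add(self, transition, td_error):
        priority = (abs(td_error) + self.eps) ** self.alpha
        if len(self.data) >= self.capacity:   # evict the oldest sample when full
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        if not self.data:
            return []
        probs = np.asarray(self.priorities, dtype=np.float64)
        probs /= probs.sum()
        idx = np.random.choice(len(self.data),
                               size=min(batch_size, len(self.data)),
                               replace=False, p=probs)
        return [self.data[i] for i in idx]


class DoublePrioritizedReplay:
    """Two prioritized buffers: every transition is stored for the critic, while only
    transitions with a positive TD error (the action did better than the critic
    predicted, as in CACLA) are stored for the actor. The positive-TD-error gate
    is an assumption inferred from the CACLA lineage, not stated in the abstract."""

    def __init__(self, capacity):
        self.critic_buffer = PrioritizedBuffer(capacity)
        self.actor_buffer = PrioritizedBuffer(capacity)

    def add(self, transition, td_error):
        self.critic_buffer.add(transition, td_error)
        if td_error > 0:
            self.actor_buffer.add(transition, td_error)

    def sample(self, batch_size):
        return (self.critic_buffer.sample(batch_size),
                self.actor_buffer.sample(batch_size))


if __name__ == "__main__":
    replay = DoublePrioritizedReplay(capacity=10000)
    # A transition would be (state, action, reward, next_state); the TD error comes from the critic.
    replay.add(("s0", "a0", 1.0, "s1"), td_error=0.8)
    replay.add(("s1", "a1", -0.5, "s2"), td_error=-0.3)
    critic_batch, actor_batch = replay.sample(batch_size=2)
    print(len(critic_batch), len(actor_batch))  # 2 transitions for the critic, 1 for the actor
```

Under these assumptions, separating the buffers lets the critic learn from both good and bad experiences, while the actor is only pulled toward actions that empirically outperformed the critic's estimate, mirroring CACLA's update rule.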
Pages: 64-79
Page count: 16
Related Papers
50 items in total; the first 10 are listed below.
[1] Yan, Chao; Xiang, Xiaojia; Wang, Chang. Fixed-Wing UAVs flocking in continuous spaces: A deep reinforcement learning approach. Robotics and Autonomous Systems, 2020, 131.
[2] Yan, Chao; Xiang, Xiaojia; Wang, Chang; Lan, Zhen. Flocking and Collision Avoidance for a Dynamic Squad of Fixed-Wing UAVs Using Deep Reinforcement Learning. 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2021: 4738-4744.
[3] Zhen, Yan; Hao, Mingrui; Sun, Wendi. Deep Reinforcement Learning Attitude Control of Fixed-Wing UAVs. Proceedings of 2020 3rd International Conference on Unmanned Systems (ICUS), 2020: 239-244.
[4] Quintero, Steven A. P.; Collins, Gaemus E.; Hespanha, Joao P. Flocking with Fixed-Wing UAVs for Distributed Sensing: A Stochastic Optimal Control Approach. 2013 American Control Conference (ACC), 2013: 2025-2031.
[5] Panov, A. I.; Ugadiarov, L. A. A World Model for Actor-Critic in Reinforcement Learning. Pattern Recognition and Image Analysis, 2023, 33(3): 467-477.
[6] Roeder, Frank; Eppe, Manfred; Nguyen, Phuong D. H.; Wermter, Stefan. Curious Hierarchical Actor-Critic Reinforcement Learning. Artificial Neural Networks and Machine Learning, ICANN 2020, Part II, 2020, 12397: 408-419.
[7] Zaki, Mohammadi; Mohan, Avinash; Gopalan, Aditya; Mannor, Shie. Actor-Critic based Improper Reinforcement Learning. International Conference on Machine Learning, Vol. 162, 2022.
[8] Zheng, Jiaohao; Kurt, Mehmet Necip; Wang, Xiaodong. Integrated Actor-Critic for Deep Reinforcement Learning. Artificial Neural Networks and Machine Learning, ICANN 2021, Part IV, 2021, 12894: 505-518.
[9] Wang, Xue-Song; Cheng, Yu-Hu; Yi, Jian-Qiang. A fuzzy Actor-Critic reinforcement learning network. Information Sciences, 2007, 177(18): 3764-3781.
[10] Mustapha, S. M.; Lachiver, G. A modified actor-critic reinforcement learning algorithm. 2000 Canadian Conference on Electrical and Computer Engineering, Conference Proceedings, Vols. 1 and 2: Navigating to a New Era, 2000: 605-609.