Adaptive Optimal Surrounding Control of Multiple Unmanned Surface Vessels via Actor-Critic Reinforcement Learning

Cited by: 1
Authors
Lu, Renzhi [1 ,2 ,3 ,4 ]
Wang, Xiaotao [5 ]
Ding, Yiyu [5 ]
Zhang, Hai-Tao [6 ,7 ]
Zhao, Feng [8 ]
Zhu, Lijun [9 ]
He, Yong [10 ,11 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Key Lab Image Proc & Intelligent Control, Wuhan 430074, Peoples R China
[2] Minist Educ, Key Lab Ind Internet Things & Networked Control, Chongqing 400065, Peoples R China
[3] Chongqing Univ, State Key Laboratory of Mech Transmiss Adv Equipme, Chongqing 400044, Peoples R China
[4] Hubei Key Lab Adv Control & Intelligent Automat C, Wuhan 430074, Peoples R China
[5] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[6] Huazhong Univ Sci & Technol, Inst Artificial Intelligence, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat,State Key L, Wuhan 430074, Peoples R China
[7] Guangdong HUST Ind Technol Res Inst, Guangdong Prov Engn Technol Res Ctr Autonomous Un, Dongguan 523808, Peoples R China
[8] China Ship Sci Res Ctr, Wuxi 214082, Peoples R China
[9] Huazhong Univ Sci & Technol, MOE Engn Res Ctr Autonomous Intelligent Unmanned, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[10] China Univ Geosci, Sch Automat, Hubei Key Lab Adv Control & Intelligent Automat, Wuhan 430074, Peoples R China
[11] China Univ Geosci, Minist Educ, Engn Res Ctr Intelligent Technol Geoexplorat, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Actor-critic networks; Lyapunov functions; reinforcement learning (RL); surrounding control; unmanned surface vessels (USVs); MULTIAGENT SYSTEMS; AVOIDANCE;
DOI
10.1109/TNNLS.2024.3474289
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, an optimal surrounding control algorithm is proposed for multiple unmanned surface vessels (USVs), in which actor-critic reinforcement learning (RL) is utilized to optimize the merging process. Specifically, the multiple-USV optimal surrounding control problem is first transformed into the Hamilton-Jacobi-Bellman (HJB) equation, which is difficult to solve due to its nonlinearity. An adaptive actor-critic RL control paradigm is then proposed to obtain the optimal surrounding strategy, wherein the Bellman residual error is utilized to construct the network update laws. In particular, a virtual controller representing intermediate transitions and an actual controller operating on the dynamics model are employed as the surrounding control solution for second-order USVs; thus, optimal surrounding control of the USVs is guaranteed. In addition, the stability of the proposed controller is analyzed by means of Lyapunov functions. Finally, numerical simulation results demonstrate that the proposed actor-critic RL-based surrounding controller achieves the surrounding objective while optimizing the evolution process, yielding 9.76% and 20.85% reductions in trajectory length and energy consumption, respectively, compared with an existing controller.
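The abstract's core idea, a critic trained by minimizing a Bellman residual alongside an actor improved through a dynamics model, can be illustrated on a toy problem. The sketch below is an assumption-laden stand-in, not the paper's USV controller: it uses a scalar linear plant, a quadratic stage cost, and a one-feature critic, all chosen here purely for illustration.

```python
import numpy as np

# Illustrative sketch only (not the paper's USV model): actor-critic-style
# policy iteration on an assumed scalar plant. The critic weight w is updated
# by descending the squared Bellman residual; the actor gain k is improved
# through the known dynamics model, echoing the virtual/actual controller
# structure described in the abstract. All parameters are assumptions.

A, B = 0.9, 0.5               # assumed plant: x_next = A*x + B*u
q, r, gamma = 1.0, 0.1, 0.95  # stage cost q*x^2 + r*u^2, discount factor

w = 0.0  # critic weight: V(x) ~ w * x^2
k = 1.0  # actor gain:    u = -k * x (initially stabilizing)

rng = np.random.default_rng(0)
for _ in range(50):
    # Critic step: minimize the squared Bellman residual on sampled states.
    for _ in range(200):
        x = rng.uniform(-1.0, 1.0)
        u = -k * x
        x_next = A * x + B * u
        # Bellman residual of the cost-to-go under the current policy
        delta = q * x**2 + r * u**2 + gamma * w * x_next**2 - w * x**2
        # Residual-gradient update: w <- w - alpha * d(delta^2)/dw / 2
        w -= 0.1 * delta * (gamma * x_next**2 - x**2)
    # Actor step: greedy improvement of u = -k*x using the plant model,
    # minimizing r*u^2 + gamma*V(A*x + B*u) over u in closed form.
    k = gamma * w * A * B / (r + gamma * w * B**2)

print(f"actor gain k = {k:.3f}, critic weight w = {w:.3f}")
```

Under these assumed parameters the iteration settles near the discounted LQR solution, with a stable closed loop (|A - B*k| < 1); the paper's actual scheme instead uses neural actor-critic networks and a second-order USV model with a Lyapunov-based stability argument.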
Pages: 14