Multi-ship collaborative collision avoidance strategy based on multi-agent deep reinforcement learning

Authors
Huang R. [1, 2, 3]
Luo L. [1, 2, 3]
Affiliations
[1] Key Laboratory of High Performance Ship Technology, Ministry of Education, Wuhan University of Technology, Wuhan
[2] School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan
[3] Sanya Science and Education Innovation Park of Wuhan University of Technology, Sanya
Funding
National Natural Science Foundation of China
Keywords
centralized training with decentralized execution; coordinated collision avoidance; multi-agent deep reinforcement learning; multi-agent Softmax deep double deterministic policy gradient; prioritized experience replay;
DOI
10.13196/j.cims.2023.0382
Abstract
To improve the coordination, safety, practicability, and energy efficiency of intelligent collision avoidance strategies for multi-ship encounters, a Prioritized Experience Replay-Multi-Agent Softmax Deep Double Deterministic Policy Gradient (PER-MASD3) algorithm is proposed. It combines the prioritized experience replay mechanism with the Centralized Training with Decentralized Execution (CTDE) framework to solve the multi-ship cooperative collision avoidance problem. The algorithm addresses the value estimation bias of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm and introduces an entropy regularization term during model training to promote the exploration and control of stochastic control policies. Adaptive noise is adopted so that exploration remains effective at different stages of the task, further improving the learning performance and stability of the algorithm. Experiments show that, when applied to multi-ship collaborative collision avoidance, PER-MASD3 achieves better decision-making, faster convergence, and more stable performance. © 2024 CIMS. All rights reserved.
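As a rough illustration of the prioritized experience replay mechanism named in the abstract, the following Python sketch shows the commonly used proportional variant: transitions are sampled with probability proportional to a power of their absolute TD error, and the resulting bias is corrected with importance-sampling weights. The function name `per_sample`, the hyperparameters `alpha`, `beta`, and `eps`, and their default values are assumptions for illustration; the abstract does not state which prioritization scheme or settings PER-MASD3 uses.

```python
import numpy as np

def per_sample(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6, rng=None):
    """Proportional prioritized sampling (illustrative sketch, hypothetical defaults)."""
    rng = np.random.default_rng() if rng is None else rng
    # Priority grows with |TD error|; eps keeps zero-error transitions sampleable.
    priorities = (np.abs(td_errors) + eps) ** alpha
    probs = priorities / priorities.sum()
    # Draw indices with replacement according to the priority distribution.
    idx = rng.choice(len(probs), size=batch_size, p=probs)
    # Importance-sampling weights, normalized so the largest weight in the batch is 1.
    weights = (len(probs) * probs[idx]) ** (-beta)
    weights = weights / weights.max()
    return idx, weights, probs
```

In a training loop, the returned weights would typically scale each sampled transition's critic loss, and the stored priorities would be refreshed with the new TD errors after each update.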
Pages: 1972-1988
Number of pages: 16