Coordinated multi-agent hierarchical deep reinforcement learning to solve multi-trip vehicle routing problems with soft time windows

Cited by: 1

Authors:

Zhang, Zixian [1]

Qi, Geqi [1]

Guan, Wei [1,2]

Affiliations:

[1] Minist Transport, Key Lab Transport Ind Big Data Applicat Technol Co, Beijing, Peoples R China

[2] Beijing Jiaotong Univ, Key Lab Transport Ind Big Data Applicat Technol Co, Minist Transport, Beijing 100044, Peoples R China

Source:

IET INTELLIGENT TRANSPORT SYSTEMS | 2023 / Vol. 17 / Issue 10

Keywords:

goods distribution; hierarchical systems; multi-agent systems; neural nets; optimization; coordinated multi-agent; deep reinforcement learning; hierarchical layer; vehicle routing problem with time window; LOCAL SEARCH;

DOI:

10.1049/itr2.12394

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

The Vehicle Routing Problem (VRP) is a widespread problem in the transportation field that tests how intelligently vehicles can make decisions. The Multi-Trip Vehicle Routing Problem with Time Windows (MTVRPTW), an extension of the VRP that considers multiple departures from one depot and temporal constraints on visiting nodes, has become one of the critical issues in scheduling for logistics, bus transit, railway, and aviation. Traditionally, the MTVRPTW is solved by heuristic algorithms, which are generally time-consuming and yield unstable results. Reinforcement learning (RL) and multi-agent frameworks have become popular for solving the VRP with better performance. However, the lack of variant dimensions in the search space and of knowledge exchange between agents inhibits further improvement of these algorithms. Therefore, this study proposes a Coordinated Multi-agent Hierarchical Deep Reinforcement Learning (CMA-HDRL) method that enhances overall solution quality and convergence rate by constructing a three-layered structure (time, communication, and global layers), designed specifically to handle state space explosion and improve collaboration between agents. The results show that the proposed method significantly outperforms the general genetic algorithm (GA), RL, multi-agent, and hierarchical algorithms, both in effectiveness, measured by a cost consisting of travel time and penalty time, and in operational robustness.


Pages: 2034-2051

Number of pages: 18
