C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem

Cited by: 0
Authors
Wang, Fei [1]
Zhang, Honghai [1,2]
Du, Sen [1]
Hua, Mingzhuang [2]
Zhong, Gang [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Civil Aviat, Nanjing 211106, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Gen Aviat & Flight, Nanjing 211106, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Unmanned aerial vehicle; Vehicle routing problem; Order delivery; Reinforcement learning; Multi-agent; Proximal policy optimization
DOI
10.1016/j.cja.2024.09.005
Chinese Library Classification number
V [Aeronautics, Astronautics]
Subject classification code
08; 0825
Abstract
Unmanned Aerial Vehicles (UAVs) are an emerging electric transportation carrier with substantial promise for the logistics sector. This paper proposes Centralized-S Proximal Policy Optimization (C-SPPO), a reinforcement learning framework based on a centralized decision process that also considers policy entropy (S). The framework aims to plan the best scheduling scheme, with the objective of minimizing both the timeout of order requests and the flight impact of UAVs that may lead to conflicts. In this framework, matching intents are generated from the observations of UAV agents, and the final conflict-free matching results are output under the guidance of a centralized decision maker. In addition, a pre-activation operation is introduced to further enhance cooperation among UAV agents. Simulation experiments based on real-world data from New York City are conducted. The results indicate that the proposed C-SPPO outperforms the baseline algorithms in Average Delay Time (ADT), Maximum Delay Time (MDT), Order Delay Rate (ODR), Average Flight Distance (AFD), and Flight Impact Ratio (FIR). Furthermore, the framework scales to scenarios of different sizes without requiring additional training. (c) 2024 Production and hosting by Elsevier Ltd. on behalf of Chinese Society of Aeronautics and Astronautics. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
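The centralized step in the abstract, where per-agent matching intents are reduced to a conflict-free result, can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not say how the centralized decision maker resolves conflicts, so a simple greedy rule (highest score wins) is swapped in here. The function name `resolve_intents`, the score values, and the one-order-per-UAV constraint are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def resolve_intents(intents: Dict[str, List[Tuple[str, float]]]) -> Dict[str, str]:
    """Greedy centralized resolution (illustrative assumption, not C-SPPO itself).

    `intents` maps each UAV id to a list of (order id, score) pairs.
    Returns a UAV -> order mapping in which no order is assigned twice
    and no UAV receives more than one order.
    """
    # Flatten all intents and sort by descending score;
    # ties are broken by UAV id, then order id, for determinism.
    candidates = sorted(
        ((score, uav, order)
         for uav, prefs in intents.items()
         for order, score in prefs),
        key=lambda c: (-c[0], c[1], c[2]),
    )

    matched: Dict[str, str] = {}
    taken_orders = set()
    for _score, uav, order in candidates:
        # Skip any intent that would conflict with an earlier, higher-scored match.
        if uav in matched or order in taken_orders:
            continue
        matched[uav] = order
        taken_orders.add(order)
    return matched


# Hypothetical intents: three UAVs compete for two orders.
intents = {
    "uav_a": [("order_1", 0.9), ("order_2", 0.4)],
    "uav_b": [("order_1", 0.8), ("order_2", 0.7)],
    "uav_c": [("order_1", 0.6)],
}
result = resolve_intents(intents)
print(result)
```

In this toy run `uav_c` stays unmatched because its only intended order was already taken. A learned framework would presumably replace the fixed scores with policy outputs and the greedy rule with whatever guidance the centralized decision maker provides.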
Pages: 21
Related papers
50 in total
  • [31] Cooperative Deep Reinforcement Learning for Large-Scale Traffic Grid Signal Control
    Tan, Tian
    Bao, Feng
    Deng, Yue
    Jin, Alex
    Dai, Qionghai
    Wang, Jie
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (06) : 2687 - 2700
  • [32] Large-Scale Solar-Powered UAV Attitude Control Using Deep Reinforcement Learning in Hardware-in-Loop Verification
    Yan, Yongzhao
    Cao, Huazhen
    Zhang, Boyang
    Ni, Wenjun
    Wang, Bo
    Ma, Xiaoping
    DRONES, 2024, 8 (09)
  • [33] Unmanned Aerial Vehicle Path Planning Algorithm Based on Deep Reinforcement Learning in Large-Scale and Dynamic Environments
    Xie, Ronglei
    Meng, Zhijun
    Wang, Lifeng
    Li, Haochen
    Wang, Kaipeng
    Wu, Zhe
    IEEE ACCESS, 2021, 9 : 24884 - 24900
  • [34] Distributed Hierarchical Deep Reinforcement Learning for Large-Scale Grid Emergency Control
    Chen, Yixi
    Zhu, Jizhong
    Liu, Yun
    Zhang, Le
    Zhou, Jialin
    IEEE TRANSACTIONS ON POWER SYSTEMS, 2024, 39 (02) : 4446 - 4458
  • [36] Dynamic Optimization for Secure MIMO Beamforming using Large-scale Reinforcement Learning
    Zhang, Xinran
    Sun, Songlin
    2019 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2019
  • [37] Large-scale dynamic surgical scheduling under uncertainty by hierarchical reinforcement learning
    Zhao, Lixiang
    Zhu, Han
    Zhang, Min
    Tang, Jiafu
    Wang, Yu
    INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH, 2024
  • [38] Large-scale cost function learning for path planning using deep inverse reinforcement learning
    Wulfmeier, Markus
    Rao, Dushyant
    Wang, Dominic Zeng
    Ondruska, Peter
    Posner, Ingmar
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2017, 36 (10) : 1073 - 1087
  • [39] Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Deep Reinforcement Learning Framework
    Yang, Yaodong
    Hao, Jianye
    Zheng, Yan
    Yu, Chao
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019 : 630 - 636
  • [40] RBG: Hierarchically Solving Large-Scale Routing Problems in Logistic Systems via Reinforcement Learning
    Zong, Zefang
    Wang, Hansen
    Wang, Jingwei
    Zheng, Meng
    Li, Yong
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022 : 4648 - 4658