C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem

Cited by: 0
Authors
Wang, Fei [1]
Zhang, Honghai [1,2]
Du, Sen [1]
Hua, Mingzhuang [2]
Zhong, Gang [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Civil Aviat, Nanjing 211106, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Gen Aviat & Flight, Nanjing 211106, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Unmanned aerial vehicle; Vehicle routing problem; Order delivery; Reinforcement learning; Multi-agent; Proximal policy optimization;
DOI
10.1016/j.cja.2024.09.005
Chinese Library Classification (CLC) number
V [Aeronautics and Astronautics];
Discipline classification code
08 ; 0825 ;
Abstract
The Unmanned Aerial Vehicle (UAV) stands as a burgeoning electric transportation carrier, holding substantial promise for the logistics sector. A reinforcement learning framework, Centralized-S Proximal Policy Optimization (C-SPPO), based on a centralized decision process and considering policy entropy (S), is proposed. The proposed framework aims to plan the best scheduling scheme with the objective of minimizing both the timeout of order requests and the flight impact of UAVs that may lead to conflicts. In this framework, the intents of matching actions are generated from the observations of UAV agents, and the final conflict-free matching results are output under the guidance of a centralized decision maker. Concurrently, a pre-activation operation is introduced to further enhance the cooperation among UAV agents. Simulation experiments based on real-world data from New York City are conducted. The results indicate that the proposed C-SPPO outperforms the baseline algorithms in the Average Delay Time (ADT), the Maximum Delay Time (MDT), the Order Delay Rate (ODR), the Average Flight Distance (AFD), and the Flight Impact Ratio (FIR). Furthermore, the framework demonstrates scalability to scenarios of different sizes without requiring additional training. (c) 2024 Production and hosting by Elsevier Ltd. on behalf of Chinese Society of Aeronautics and Astronautics. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
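The abstract names Proximal Policy Optimization with a policy-entropy term but does not give the C-SPPO loss itself. As a rough illustration only, the sketch below shows the standard PPO clipped surrogate combined with an entropy bonus; the function name `ppo_entropy_loss` and the defaults `clip_eps=0.2` and `ent_coef=0.01` are assumptions made for this example and are not taken from the paper.

```python
import math


def ppo_entropy_loss(new_logp, old_logp, advantages, entropies,
                     clip_eps=0.2, ent_coef=0.01):
    """Generic PPO clipped loss with an entropy bonus (value to minimize).

    Hypothetical helper for illustration; not the paper's C-SPPO loss.
    All arguments are equal-length lists of per-sample floats.
    """
    surrogate = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the updated and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio to [1 - eps, 1 + eps] to limit the policy step.
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic choice between the unclipped and clipped terms.
        surrogate += min(ratio * adv, clipped * adv)
    n = len(advantages)
    policy_loss = -surrogate / n
    mean_entropy = sum(entropies) / n
    # Subtracting the entropy term rewards more exploratory policies.
    return policy_loss - ent_coef * mean_entropy
```

A larger `ent_coef` weights exploration more heavily; how C-SPPO actually sets or schedules this weight is not stated in the abstract.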
Pages: 21
Related papers
50 items in total
  • [21] Deep Reinforcement Learning-Based Large-Scale Robot Exploration
    Cao, Yuhong
    Zhao, Rui
    Wang, Yizhuo
    Xiang, Bairan
    Sartoretti, Guillaume
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (05) : 4631 - 4638
  • [22] Large-Scale and Adaptive Service Composition Using Deep Reinforcement Learning
    Wang, Hongbing
    Gu, Mingzhu
    Yu, Qi
    Fei, Huanhuan
    Li, Jiajie
    Tao, Yong
    SERVICE-ORIENTED COMPUTING, ICSOC 2017, 2017, 10601 : 383 - 391
  • [23] Adaptive and large-scale service composition based on deep reinforcement learning
    Wang, Hongbing
    Gu, Mingzhu
    Yu, Qi
    Tao, Yong
    Li, Jiajie
    Fei, Huanhuan
    Yan, Jia
    Zhao, Wei
    Hong, Tianjing
    KNOWLEDGE-BASED SYSTEMS, 2019, 180 : 75 - 90
  • [24] Dynamic Dispatching for Large-Scale Heterogeneous Fleet via Multi-agent Deep Reinforcement Learning
    Zhang, Chi
    Odonkor, Philip
    Zheng, Shuai
    Khorasgani, Hamed
    Serita, Susumu
    Gupta, Chetan
    Wang, Haiyan
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 1436 - 1441
  • [25] Multi-task deep reinforcement learning for dynamic scheduling of large-scale fleets in earthmoving operations
    Zhang, Yunuo
    Zhang, Jun
    Wang, Xiaoling
    Zeng, Tuocheng
    AUTOMATION IN CONSTRUCTION, 2025, 174
  • [26] Reinforcement Learning for Sustainability: Adapting in large-scale heterogeneous dynamic environments
    Dusparic, Ivana
    2022 IEEE INTERNATIONAL CONFERENCE ON AUTONOMIC COMPUTING AND SELF-ORGANIZING SYSTEMS COMPANION (ACSOS-C 2022), 2022, : 49 - 50
  • [27] An End-to-End Deep Reinforcement Learning Framework for Electric Vehicle Routing Problem
    Wang, Mengqin
    Wei, Yanling
    Huang, Xueliang
    Gao, Shan
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (20) : 33671 - 33682
  • [28] OpenPARF: An Open-source Placement and Routing Framework for Large-scale Heterogeneous FPGAs with Deep Learning Toolkit
    Mai J.
    Wang J.
    Di Z.
    Lin Y.
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2023, 45 (09) : 3118 - 3131
  • [29] The third party logistics provider freight management problem: a framework and deep reinforcement learning approach
    Abbasi-Pooya, Amin
    Lash, Michael T.
    ANNALS OF OPERATIONS RESEARCH, 2024, 339 (1-2) : 965 - 1024
  • [30] DDHH: A Decentralized Deep Learning Framework for Large-scale Heterogeneous Networks
    Imran, Mubashir
    Yin, Hongzhi
    Chen, Tong
    Huang, Zi
    Zhang, Xiangliang
    Zheng, Kai
    2021 IEEE 37TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2021), 2021, : 2033 - 2038