C-SPPO: A deep reinforcement learning framework for large-scale dynamic logistics UAV routing problem

Authors
Wang, Fei [1]
Zhang, Honghai [1, 2]
Du, Sen [1]
Hua, Mingzhuang [2]
Zhong, Gang [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Civil Aviat, Nanjing 211106, Peoples R China
[2] Nanjing Univ Aeronaut & Astronaut, Coll Gen Aviat & Flight, Nanjing 211106, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
Unmanned aerial vehicle; Vehicle routing problem; Order delivery; Reinforcement learning; Multi-agent; Proximal policy optimization;
DOI
10.1016/j.cja.2024.09.005
Chinese Library Classification (CLC) number
V [Aeronautics, Astronautics]
Discipline classification code
08; 0825
Abstract
The Unmanned Aerial Vehicle (UAV) is an emerging electric transport carrier with substantial promise for the logistics sector. A reinforcement learning framework, Centralized-S Proximal Policy Optimization (C-SPPO), based on a centralized decision process and considering policy entropy (S), is proposed. The framework aims to plan the best scheduling scheme, with the objective of minimizing both the timeout of order requests and the flight impact of UAVs that may lead to conflicts. In this framework, matching-action intents are generated from the observations of UAV agents, and the final conflict-free matching results are output under the guidance of a centralized decision maker. In addition, a pre-activation operation is introduced to further enhance cooperation among UAV agents. Simulation experiments based on real-world data from New York City are conducted. The results indicate that C-SPPO outperforms the baseline algorithms in Average Delay Time (ADT), Maximum Delay Time (MDT), Order Delay Rate (ODR), Average Flight Distance (AFD), and Flight Impact Ratio (FIR). Furthermore, the framework scales to scenarios of different sizes without additional training. (c) 2024 Production and hosting by Elsevier Ltd. on behalf of Chinese Society of Aeronautics and Astronautics. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 21
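The abstract describes UAV agents producing matching intents that a centralized decision maker turns into a conflict-free matching. The record does not say how that resolution is done, so the following is only a minimal sketch under stated assumptions: intents are modeled as per-UAV preference scores over orders, and conflicts are resolved greedily by score. The function name `resolve_intents`, the score representation, and the greedy rule are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def resolve_intents(intents: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Greedy, hypothetical stand-in for a centralized decision maker.

    `intents` maps each UAV id to {order id: preference score}.
    Returns a conflict-free matching {uav id: order id}: every UAV gets
    at most one order and every order goes to at most one UAV.
    """
    # Flatten to (score, uav, order) triples and visit the strongest intents first.
    ranked: List[Tuple[float, str, str]] = sorted(
        (
            (score, uav, order)
            for uav, prefs in intents.items()
            for order, score in prefs.items()
        ),
        key=lambda t: (-t[0], t[1], t[2]),  # deterministic tie-break on ids
    )

    matching: Dict[str, str] = {}
    taken_orders = set()
    for _score, uav, order in ranked:
        # Skip any intent that would conflict with an earlier, stronger one.
        if uav in matching or order in taken_orders:
            continue
        matching[uav] = order
        taken_orders.add(order)
    return matching


# Illustrative (made-up) intents: three UAVs compete for two orders.
intents = {
    "uav_1": {"order_a": 0.9, "order_b": 0.4},
    "uav_2": {"order_a": 0.8, "order_b": 0.3},
    "uav_3": {"order_a": 0.2},
}
matching = resolve_intents(intents)
```

In this toy input, two UAVs prefer the same order; the stronger intent wins and the other UAV falls back to its next choice, so no order is assigned twice. A learned framework would presumably replace both the scores and the resolution rule with trained components.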