Scheduling of AGVs in Automated Container Terminal Based on the Deep Deterministic Policy Gradient (DDPG) Using the Convolutional Neural Network (CNN)

Cited by: 23
Authors
Chen, Chun [1]
Hu, Zhi-Hua [1]
Wang, Lei [1]
Affiliation
[1] Shanghai Maritime Univ, Logist Res Ctr, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
automated container terminal; automated guided vehicles; dynamic scheduling; deep reinforcement learning; ASSIGNMENT; ALGORITHM; EQUIPMENT; STRATEGY;
DOI
10.3390/jmse9121439
Chinese Library Classification
U6 [Waterway Transportation]; P75 [Ocean Engineering];
Subject classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
Improving the horizontal transportation efficiency of terminal Automated Guided Vehicles (AGVs) requires a focus on coordinating, in time and space, the synchronized operation of the loading and unloading equipment and the transportation equipment, and on reducing task completion time. Traditional scheduling methods have limited dynamic response capabilities and are not suitable for dynamic terminal operating environments. This paper therefore discusses how to use delivery task information and AGV spatiotemporal information to schedule AGVs dynamically so as to minimize task delay time and AGV travel time, and proposes a deep reinforcement learning framework. The framework combines the real-time response and flexibility of the Convolutional Neural Network (CNN) and the Deep Deterministic Policy Gradient (DDPG) algorithm, and can dynamically adjust AGV scheduling strategies according to the input spatiotemporal state information. In the framework, the AGV scheduling process is first defined as a Markov decision process: the system's spatiotemporal state information is analyzed in detail, and assignment heuristic rules and a reward reshaping mechanism are introduced in order to decouple the model from the AGV dynamic scheduling problem. Then, a multi-channel matrix is built to characterize the space-time state information, the CNN is used to generalize and approximate the action value functions of different states, and the DDPG algorithm is used to achieve the best matching of AGVs and containers in the decision stage. The proposed model and algorithm framework are applied to experiments with different cases, and their scheduling performance is compared with that of the adaptive genetic algorithm (AGA) and the rolling horizon approach (RHPA).
The results show that, compared with a single scheduling rule, the proposed algorithm improves the average performance on task completion time, task delay time, AGV travel time, and task delay rate by 15.63%, 56.16%, 16.36%, and 30.22%, respectively; compared with AGA and RHPA, it reduces task completion time by approximately 3.10% and 2.40%, respectively.
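The multi-channel matrix mentioned in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' encoding: the abstract does not specify the channels, so the three used here (AGV presence, time until an AGV is free, task slack), the grid layout, and the names `AGV`, `Task`, and `build_state` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AGV:
    row: int
    col: int
    busy_until: float  # time at which the AGV becomes free (assumed field)


@dataclass
class Task:
    row: int
    col: int
    due_time: float  # latest desired completion time (assumed field)


def build_state(agvs, tasks, now, rows, cols):
    """Return a 3-channel grid as nested lists, indexed [channel][row][col].

    Channel 0: number of AGVs in each cell.
    Channel 1: longest remaining busy time of any AGV in the cell.
    Channel 2: task slack (due time minus current time; negative means late).
    The channel choice is hypothetical and only illustrates the idea.
    """
    channels = [[[0.0] * cols for _ in range(rows)] for _ in range(3)]
    for agv in agvs:
        channels[0][agv.row][agv.col] += 1.0
        remaining = max(0.0, agv.busy_until - now)
        channels[1][agv.row][agv.col] = max(channels[1][agv.row][agv.col], remaining)
    for task in tasks:
        channels[2][task.row][task.col] = task.due_time - now
    return channels


# Usage: two AGVs and one pending task on a 3 x 4 grid at time 2.0.
agvs = [AGV(row=0, col=0, busy_until=5.0), AGV(row=1, col=2, busy_until=0.0)]
tasks = [Task(row=2, col=1, due_time=12.0)]
state = build_state(agvs, tasks, now=2.0, rows=3, cols=4)
```

A grid of this shape is the kind of image-like input a CNN can consume, which is presumably why the abstract pairs a multi-channel matrix with a convolutional network; the actual channels and resolution would come from the paper itself.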
Pages: 29