Scheduling of AGVs in Automated Container Terminal Based on the Deep Deterministic Policy Gradient (DDPG) Using the Convolutional Neural Network (CNN)

Cited by: 23
Authors
Chen, Chun [1]
Hu, Zhi-Hua [1]
Wang, Lei [1]
Affiliations
[1] Shanghai Maritime Univ, Logist Res Ctr, Shanghai 201306, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
automated container terminal; automated guided vehicles; dynamic scheduling; deep reinforcement learning; ASSIGNMENT; ALGORITHM; EQUIPMENT; STRATEGY;
DOI
10.3390/jmse9121439
Chinese Library Classification (CLC) number
U6 [Waterway Transportation]; P75 [Ocean Engineering]
Discipline classification codes
0814; 081505; 0824; 082401
Abstract
To improve the horizontal transportation efficiency of terminal Automated Guided Vehicles (AGVs), it is necessary to coordinate the time and space synchronization of the loading and unloading equipment and the transportation equipment during operation, and to reduce task completion time. Traditional scheduling methods have limited dynamic response capabilities and are not suitable for dynamic terminal operating environments. Therefore, this paper discusses how to use delivery task information and AGV spatiotemporal information to schedule AGVs dynamically, minimizing task delay time and AGV travel time, and proposes a deep reinforcement learning algorithm framework. The framework combines the real-time response and flexibility of the Convolutional Neural Network (CNN) and the Deep Deterministic Policy Gradient (DDPG) algorithm, and can dynamically adjust AGV scheduling strategies according to the input spatiotemporal state information. In the framework, the AGV scheduling process is first defined as a Markov decision process: the system's spatiotemporal state information is analyzed in detail, and assignment heuristic rules and a reward reshaping mechanism are introduced to decouple the model from the AGV dynamic scheduling problem. Then, a multi-channel matrix is built to characterize the spatiotemporal state information, the CNN is used to generalize and approximate the action value functions of different state information, and the DDPG algorithm is used to achieve the best AGV and container matching in the decision stage. The proposed model and algorithm framework are applied to experiments with different cases, and their scheduling performance is compared with that of the adaptive genetic algorithm (AGA) and the rolling horizon approach (RHPA). The results show that, compared with a single scheduling rule, the proposed algorithm improves the average performance of task completion time, task delay time, AGV travel time, and task delay rate by 15.63%, 56.16%, 16.36%, and 30.22%, respectively; compared with AGA and RHPA, it reduces task completion time by approximately 3.10% and 2.40%, respectively.
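The idea of describing the scheduling state as a multi-channel matrix and scoring an AGV and container match by delay and travel time can be illustrated with a small sketch. This is not the paper's model: the three channels (travel time, arrival time, delay), the grid layout, the Manhattan travel estimate, the reward weights, and all names (`Task`, `AGV`, `build_state`, `reward`, `greedy_match`) are assumptions made for illustration, and a simple greedy choice stands in for the learned CNN and DDPG policy.

```python
from dataclasses import dataclass


@dataclass
class Task:
    pos: tuple    # (x, y) pickup location on a hypothetical grid
    due: float    # latest pickup time before the task counts as delayed


@dataclass
class AGV:
    pos: tuple    # (x, y) current location on the same grid
    free_at: float  # time at which the AGV finishes its current job


def travel_time(a, b, speed=1.0):
    """Manhattan distance divided by speed; a stand-in for real path planning."""
    return (abs(a[0] - b[0]) + abs(a[1] - b[1])) / speed


def build_state(tasks, agvs, now):
    """Return three channels, each a len(tasks) x len(agvs) matrix.

    Channel 0: travel time from AGV j to task i.
    Channel 1: earliest arrival time of AGV j at task i.
    Channel 2: resulting delay of task i if served by AGV j.
    """
    travel = [[travel_time(v.pos, t.pos) for v in agvs] for t in tasks]
    arrival = [[max(now, v.free_at) + travel[i][j] for j, v in enumerate(agvs)]
               for i in range(len(tasks))]
    delay = [[max(0.0, arrival[i][j] - t.due) for j in range(len(agvs))]
             for i, t in enumerate(tasks)]
    return [travel, arrival, delay]


def reward(state, task_idx, agv_idx, w_delay=1.0, w_travel=0.1):
    """Negative weighted sum of delay and travel time (weights are assumed)."""
    travel, _, delay = state
    return -(w_delay * delay[task_idx][agv_idx]
             + w_travel * travel[task_idx][agv_idx])


def greedy_match(state):
    """Pick the (task, AGV) pair with the highest reward.

    Placeholder for the learned policy; ties resolve to the first pair found.
    """
    n_tasks, n_agvs = len(state[0]), len(state[0][0])
    pairs = [(i, j) for i in range(n_tasks) for j in range(n_agvs)]
    return max(pairs, key=lambda p: reward(state, *p))


tasks = [Task((0, 0), 5.0), Task((4, 4), 3.0)]
agvs = [AGV((0, 1), 0.0), AGV((4, 3), 2.0)]
state = build_state(tasks, agvs, now=0.0)
best = greedy_match(state)
print(best, reward(state, *best))
```

In a learning setup along the lines the abstract describes, a matrix like `state` would be the network input and the reward signal would drive training, in place of the greedy rule used here.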
Pages: 29
Related papers
50 in total
  • [21] Neural-Network-Based Deterministic Policy Gradient for Depth Control of AUVs
    Wu, Hui
    Song, Shiji
    You, Keyou
    Wu, Cheng
    2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 839 - 844
  • [22] Dynamic Prioritization and Adaptive Scheduling using Deep Deterministic Policy Gradient for Deploying Microservice-based VNFs
    Chetty, Swarna B.
    Ahmadi, Hamed
    Nag, Avishek
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 1487 - 1493
  • [23] Study on the Multi-Equipment Integrated Scheduling Problem of a U-Shaped Automated Container Terminal Based on Graph Neural Network and Deep Reinforcement Learning
    Zhang, Qinglei
    Zhu, Yi
    Qin, Jiyun
    Duan, Jianguo
    Zhou, Ying
    Shi, Huaixia
    Nie, Liang
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2025, 13 (02)
  • [24] A Self-Adaptive Vibration Reduction Method Based on Deep Deterministic Policy Gradient (DDPG) Reinforcement Learning Algorithm
    Jin, Xin
    Ma, Hongbao
    Kang, Yihua
    APPLIED SCIENCES-BASEL, 2022, 12 (19)
  • [25] Automated skin lesion segmentation using attention-based deep convolutional neural network
    Arora, Ridhi
    Raman, Balasubramanian
    Nayyar, Kritagya
    Awasthi, Ruchi
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2021, 65
  • [26] Nonlinear Nonsingular Fast Terminal Sliding Mode Control Using Deep Deterministic Policy Gradient
    Xu, Zefeng
    Huang, Wenkai
    Li, Zexuan
    Hu, Linkai
    Lu, Puwei
    APPLIED SCIENCES-BASEL, 2021, 11 (10)
  • [27] Mitigation of Scheduling Violations in Time-Sensitive Networking using Deep Deterministic Policy Gradient
    Zhou, Boyang
    Cheng, Liang
    PROCEEDINGS OF THE 4TH FLEXNETS WORKSHOP ON FLEXIBLE NETWORKS, ARTIFICIAL INTELLIGENCE SUPPORTED NETWORK FLEXIBILITY AND AGILITY (FLEXNETS'21), 2021, : 32 - 37
  • [28] Recognition and Positioning of Container Lock Holes for Intelligent Handling Terminal Based on Convolutional Neural Network
    Wang, Xue
    TRAITEMENT DU SIGNAL, 2021, 38 (02) : 467 - 472
  • [29] Continuous Control for Automated Lane Change Behavior Based on Deep Deterministic Policy Gradient Algorithm
    Wang, Pin
    Li, Hanhan
    Chan, Ching-Yao
    2019 30TH IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV19), 2019, : 1454 - 1460
  • [30] Virtual Network Function Migration Optimization Algorithm Based on Deep Deterministic Policy Gradient
    Tang Lun
    He Lanqin
    Tan Qi
    Chen Qianbin
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (02) : 404 - 411