Learning to multi-vehicle cooperative bin packing problem via sequence-to-sequence policy network with deep reinforcement learning model

Cited by: 3
Authors
Tian, Ran [1]
Kang, Chunming [1]
Bi, Jiaming [1]
Ma, Zhongyu [1]
Liu, Yanxing [1]
Yang, Saisai [1]
Li, Fangfang [1]
Affiliations
[1] Northwest Normal Univ, Dept Coll Comp Sci & Engn, Lanzhou 730070, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Deep Reinforcement Learning; 3D Bin Packing Policy; Position Sequence; Logistics Packing; SEARCH ALGORITHM; LOCAL SEARCH; SUPPLY CHAIN; OPTIMIZATION;
DOI
10.1016/j.cie.2023.108998
Chinese Library Classification (CLC) number
TP39 [Computer Applications]
Discipline classification codes
081203; 0835
Abstract
In logistics bin packing scenarios where vehicles have only rear doors, the packing sequence of items determines how well the vehicle's packing space is utilized, yet relatively little research has addressed optimizing that sequence. This article therefore focuses on the packing sequence problem within the multi-vehicle cooperative bin packing problem (MVCBPP) and proposes S2SDRL, a deep reinforcement learning model built on a sequence-to-sequence (Seq2Seq) policy network. First, a Seq2Seq neural network is constructed whose encoder and decoder combine a bidirectional LSTM with an attention module; it determines the packing probability of each item. Second, the packing strategy for the items is obtained through the constructed reinforcement learning packing framework. Finally, the Seq2Seq policy network is updated and optimized by the policy gradient method with a baseline to obtain the current optimal packing strategy. Across several bin packing scenarios, S2SDRL improves average vehicle space utilization by more than 4.0% compared with the traditional packing algorithm, and the model's forward computation time is much shorter than that of the traditional heuristic algorithm, which makes the model more practical for real applications. Ablation experiments confirm that the modules in S2SDRL are effective for optimizing the packing order, and a sensitivity analysis shows that the model retains a degree of stability when the input data changes.
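The policy gradient method with a baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's network, reward, or baseline, so everything below is an assumption: a plain score vector stands in for the Seq2Seq policy network, `toy_utilization` is a hypothetical one-dimensional stand-in for 3D space utilization, and the baseline is a moving average of past rewards.

```python
import math
import random


def sample_order(scores, rng):
    """Sample a packing order without replacement from softmax(scores).

    Returns the order, its log-probability, and the gradient of that
    log-probability with respect to the scores.
    """
    remaining = list(range(len(scores)))
    order, log_prob = [], 0.0
    grad = [0.0] * len(scores)
    while remaining:
        top = max(scores[i] for i in remaining)
        weights = [math.exp(scores[i] - top) for i in remaining]
        total = sum(weights)
        probs = [w / total for w in weights]
        pick = rng.choices(range(len(remaining)), weights=probs)[0]
        # Softmax log-likelihood gradient: one-hot(chosen) minus probabilities.
        for k, i in enumerate(remaining):
            grad[i] += (1.0 if k == pick else 0.0) - probs[k]
        log_prob += math.log(probs[pick])
        order.append(remaining.pop(pick))
    return order, log_prob, grad


def toy_utilization(order, volumes, capacity):
    """Hypothetical reward: load items strictly in order and stop at the
    first item that does not fit; return the filled fraction."""
    used = 0.0
    for i in order:
        if used + volumes[i] > capacity:
            break
        used += volumes[i]
    return used / capacity


def train(volumes, capacity, steps=2000, lr=0.5, seed=0):
    """REINFORCE with a moving-average baseline over packing orders."""
    rng = random.Random(seed)
    scores = [0.0] * len(volumes)
    baseline = 0.0
    for _ in range(steps):
        order, _, grad = sample_order(scores, rng)
        reward = toy_utilization(order, volumes, capacity)
        advantage = reward - baseline  # the baseline reduces gradient variance
        scores = [s + lr * advantage * g for s, g in zip(scores, grad)]
        baseline = 0.9 * baseline + 0.1 * reward
    return scores
```

Calling `train([4, 5, 3], capacity=10)` returns one learned score per item; higher scores make an item more likely to be loaded early. In this toy reward the order matters because loading stops at the first item that does not fit, which loosely mirrors the rear-door constraint described in the abstract.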
Pages: 13