Solving the online batching problem using deep reinforcement learning

Cited by: 18
Authors
Cals, Bram [1]
Zhang, Yingqian [1]
Dijkman, Remco [1]
van Dorst, Claudy [2]
Affiliations
[1] Eindhoven Univ Technol, Sch Ind Engn, POB 513, NL-5600 MB Eindhoven, Netherlands
[2] Vanderlande Ind BV, POB 18, NL-5460 AA Veghel, Netherlands
Keywords
Deep reinforcement learning; Order batching; Sequential decision making; Machine learning; Warehousing; E-commerce; ORDER PICKING; MULTIPLE PICKERS; TARDINESS; TIME
DOI
10.1016/j.cie.2021.107221
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In e-commerce markets, on-time delivery is of great importance to customer satisfaction. In this paper, we present a Deep Reinforcement Learning (DRL) approach, together with a heuristic, for deciding how and when arrived orders should be batched and picked in a warehouse to minimize the number of tardy orders. In particular, the technique facilitates deciding whether an order should be picked individually (pick-by-order) or picked in a batch with other orders (pick-by-batch), and if so, with which other orders. We approach the problem by formulating it as a semi-Markov decision process and developing a vector-based state representation that includes the characteristics of the warehouse system. This allows us to create a DRL solution that learns a strategy by interacting with the environment and to solve the problem with a proximal policy optimization algorithm. We evaluate the performance of the proposed DRL approach by comparing it with several batching and sequencing heuristics in different problem settings. The results show that the DRL approach can develop a strategy that produces consistent, good solutions and performs better than the proposed heuristics in most of the tested cases. We show that the strategy learned by DRL is different from the hand-crafted heuristics. In this paper, we demonstrate that the benefits of recent advances in DRL can be transferred to solve sequential decision-making problems in warehousing operations.
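The trade-off described in the abstract, picking an order on its own versus holding it for a batch, can be illustrated with a toy calculation of the tardy-order count. The following is a minimal sketch and not the authors' model: the single-picker setup, the `Order` fields, the fixed `pick_time` and `batch_time`, and the fixed-size batching rule are all assumptions made for illustration. The paper's semi-Markov decision process, state vector, and learned policy are not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Order:
    """Hypothetical order record (fields are assumptions, not from the paper)."""
    arrival: float  # time the order enters the system
    due: float      # completion later than this makes the order tardy


def count_tardy(orders: Sequence[Order], completion: Sequence[float]) -> int:
    """Objective from the abstract: number of orders finished after their due time."""
    return sum(1 for o, c in zip(orders, completion) if c > o.due)


def pick_by_order(orders: Sequence[Order], pick_time: float) -> List[float]:
    """One picker handles orders one at a time, in arrival order."""
    clock, done = 0.0, []
    for o in orders:
        clock = max(clock, o.arrival) + pick_time
        done.append(clock)
    return done


def pick_by_batch(orders: Sequence[Order], batch_size: int,
                  batch_time: float) -> List[float]:
    """One picker waits for a full batch, then picks the whole batch in one tour."""
    clock, done = 0.0, []
    for i in range(0, len(orders), batch_size):
        batch = orders[i:i + batch_size]
        start = max(clock, max(o.arrival for o in batch))  # wait for last arrival
        clock = start + batch_time
        done.extend([clock] * len(batch))  # all orders in a batch finish together
    return done


# Illustrative data only; orders are assumed to be sorted by arrival time.
orders = [Order(0, 5), Order(1, 6), Order(2, 12), Order(3, 13)]
tardy_single = count_tardy(orders, pick_by_order(orders, pick_time=4.0))
tardy_batched = count_tardy(orders, pick_by_batch(orders, batch_size=2, batch_time=5.0))
print(tardy_single, tardy_batched)  # prints "2 1"
```

With these made-up numbers batching yields fewer tardy orders, but other due times or pick durations reverse the outcome, which is why the choice is treated as a sequential decision problem rather than a fixed rule.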
Pages: 15