Deep reinforcement learning for demand fulfillment in online retail

Cited by: 2
Authors
Wang, Yihua [1 ]
Minner, Stefan [1 ,2 ]
Affiliations
[1] Tech Univ Munich, Sch Management, Logist & Supply Chain Management, D-80333 Munich, Germany
[2] Tech Univ Munich, Munich Data Sci Inst MDSI, D-85748 Garching, Germany
Keywords
Demand fulfillment; Semi-Markov decision processes; Deep reinforcement learning; Stochastic inventory control; Lateral transshipments; Approximation scheme; Management; Policy; Algorithms
DOI
10.1016/j.ijpe.2023.109133
CLC Number
T [Industrial Technology]
Subject Classification Code
08
Abstract
A distinctive feature of online retail is the flexibility to ship items to customers from different distribution centers (DCs). This creates interdependence between DCs and poses new challenges in demand fulfillment: deciding from which DC to satisfy each customer demand. This paper addresses a demand fulfillment problem in a multi-DC online retail environment where demand and replenishment lead time are stochastic. The objective is to minimize long-term operational costs by determining the source DC for each customer demand. We formulate the problem as a semi-Markov decision process and develop a deep reinforcement learning (DRL) algorithm to solve it. To evaluate the performance of the DRL algorithm, we compare it with a set of heuristic rules and with exact solutions obtained by linear programming. Numerical results show that the DRL policy performs on par with the most competitive heuristic on complete pooling DC networks and outperforms all the heuristics on partial pooling DC networks. Additionally, by analyzing the transshipment ratio of the best-observed policies, we provide managerial insights regarding the circumstances in which transshipment is more favorable.
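To make the fulfillment decision concrete, the following is a minimal, hypothetical sketch (not the paper's algorithm) of the core idea: learn, from cost feedback, which DC should serve an order given current stock availability. It uses a toy tabular Q-learning update rather than a deep network, and the shipping costs, stockout penalty, and two-DC setup are illustrative assumptions.

```python
import random

# Toy sketch (illustrative assumptions, not the authors' method):
# state = stock availability at each DC, action = which DC ships the order,
# reward = negative fulfillment cost. A deep RL version would replace the
# Q-table with a neural network over a richer state.
random.seed(0)
costs = [1.0, 3.0]       # assumed shipping cost from DC 0 / DC 1
penalty = 10.0           # assumed stockout (lost-sale) penalty
states = [(a, b) for a in (0, 1) for b in (0, 1)]  # (DC0 has stock?, DC1 has stock?)
Q = {s: [0.0, 0.0] for s in states}
alpha, eps = 0.1, 0.1    # learning rate, exploration rate

for _ in range(5000):
    s = random.choice(states)                     # stochastic order context
    if random.random() < eps:
        a = random.randrange(2)                   # explore
    else:
        a = max((0, 1), key=lambda i: Q[s][i])    # exploit current estimate
    r = -costs[a] if s[a] == 1 else -penalty      # cost of shipping from DC a
    Q[s][a] += alpha * (r - Q[s][a])              # one-step (bandit-style) update

# Greedy policy: chosen source DC for each availability state
policy = {s: max((0, 1), key=lambda i: Q[s][i]) for s in states}
print(policy)
```

Under these assumed costs the learned policy routes orders to the cheaper DC 0 whenever it has stock and falls back to DC 1 only when DC 0 is out, which mirrors the transshipment trade-off the abstract analyzes.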
Pages: 13