The use of continuous action representations to scale deep reinforcement learning for inventory control

Cited: 0
Authors
Vanvuchelen, Nathalie [1 ]
De Moor, Bram J. [2 ]
Boute, Robert N. [3 ,4 ,5 ]
Affiliations
[1] OMP, B-2160 Wommelgem, Belgium
[2] Eindhoven Univ Technol, Dept Ind Engn & Innovat Sci, NL-5600 MB Eindhoven, Netherlands
[3] Katholieke Univ Leuven, Res Ctr Operat Management, B-3000 Leuven, Belgium
[4] Vlerick Business Sch, Technol & Operat Management Area, B-3000 Leuven, Belgium
[5] Katholieke Univ Leuven, Flanders Make, B-3000 Leuven, Belgium
Funding
Research Foundation Flanders (FWO), Belgium
Keywords
deep reinforcement learning; continuous actions; neural networks; inventory management; S POLICIES
DOI
10.1093/imaman/dpae031
Chinese Library Classification (CLC)
C93 [Management Science]
Subject Classification Codes
12; 1201; 1202; 120202
Abstract
Deep reinforcement learning (DRL) can solve complex inventory problems with a multi-dimensional state space. However, most approaches use a discrete action representation and do not scale well to problems with multi-dimensional action spaces. We use DRL with a continuous action representation for inventory problems with a large (multi-dimensional) discrete action space. To obtain feasible discrete actions from a continuous action representation, we add a tailored mapping function to the policy network that maps its continuous outputs to a feasible integer solution. We demonstrate our approach on multi-product inventory control. We show that a continuous action representation solves larger problem instances and requires much less training time than a discrete action representation. Moreover, we show that its performance matches that of state-of-the-art heuristic replenishment policies. This promising research avenue may pave the way for applying DRL in inventory control at scale and in practice.
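The abstract's central device is the tailored mapping function appended to the policy network, which converts the network's continuous outputs into a feasible integer order vector. The record does not spell this function out, so what follows is only a minimal NumPy sketch of one plausible such mapping, assuming tanh-bounded outputs in [-1, 1] and a single shared replenishment capacity; the function name, rescaling, rounding and greedy trimming steps are illustrative assumptions, not the authors' actual construction.

```python
import numpy as np

def map_to_feasible_orders(raw_actions, capacity):
    """Hypothetical mapping from continuous policy outputs to a
    feasible integer order vector under a shared capacity constraint."""
    # Rescale tanh-bounded outputs from [-1, 1] to [0, capacity] per product.
    scaled = (np.asarray(raw_actions, dtype=float) + 1.0) / 2.0 * capacity
    # Round to the nearest integer order quantity.
    orders = np.rint(scaled).astype(int)
    # Greedily shrink the largest order until the joint capacity holds.
    while orders.sum() > capacity:
        orders[np.argmax(orders)] -= 1
    return orders

# Example: three products sharing one replenishment capacity of 10 units.
print(map_to_feasible_orders([0.9, 0.2, -0.5], capacity=10))  # e.g. [4 4 2]
```

Whatever its exact form, such a mapping must return a feasible integer action for every continuous input, so the agent can never propose an order the environment would have to reject.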
Pages: 51-66
Page count: 16
Related Papers
50 related records in total
  • [1] Hierarchical Deep Reinforcement Learning for Continuous Action Control
    Yang, Zhaoyang
    Merrick, Kathryn
    Jin, Lianwen
    Abbass, Hussein A.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (11) : 5174 - 5184
  • [2] Deep reinforcement learning in seat inventory control problem: an action generation approach
    Alamdari, Neda Etebari
    Savard, Gilles
    JOURNAL OF REVENUE AND PRICING MANAGEMENT, 2021, 20 (05) : 566 - 579
  • [3] Multi-Task Deep Reinforcement Learning for Continuous Action Control
    Yang, Zhaoyang
    Merrick, Kathryn
    Abbass, Hussein
    Jin, Lianwen
    PROCEEDINGS OF THE TWENTY-SIXTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 3301 - 3307
  • [4] Deep reinforcement learning for inventory control: A roadmap
    Boute, Robert N.
    Gijsbrechts, Joren
    van Jaarsveld, Willem
    Vanvuchelen, Nathalie
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2022, 298 (02) : 401 - 412
  • [5] Benchmarking Deep Reinforcement Learning for Continuous Control
    Duan, Yan
    Chen, Xi
    Houthooft, Rein
    Schulman, John
    Abbeel, Pieter
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016
  • [6] Learning Action Representations for Reinforcement Learning
    Chandak, Yash
    Theocharous, Georgios
    Kostas, James E.
    Jordan, Scott M.
    Thomas, Philip S.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019
  • [7] Soft Action Particle Deep Reinforcement Learning for a Continuous Action Space
    Kang, Minjae
    Lee, Kyungjae
    Oh, Songhwai
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 5028 - 5033
  • [8] From Fly Detectors to Action Control: Representations in Reinforcement Learning
    Rusanen, Anna-Mari
    Lappi, Otto
    Pekkanen, Jami
    Kuokkanen, Jesse
    PHILOSOPHY OF SCIENCE, 2021, 88 (05) : 1045 - 1054
  • [9] Action Robust Reinforcement Learning and Applications in Continuous Control
    Tessler, Chen
    Efroni, Yonathan
    Mannor, Shie
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019