Constrained continuous-action reinforcement learning for supply chain inventory management

Cited by: 2
Authors
Burtea, Radu [1 ]
Tsay, Calvin [1 ]
Affiliations
[1] Imperial Coll London, Dept Comp, London SW7 2AZ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Safe reinforcement learning; Optimization and machine learning toolkit (OMLT); Continuous-action Q-learning; Inventory management problem; COMPREHENSIVE SURVEY; PROGRAMMING APPROACH; SYSTEMS; OPTIMIZATION; POLICIES; MODEL
DOI
10.1016/j.compchemeng.2023.108518
Chinese Library Classification
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
Reinforcement learning (RL) is a promising solution for difficult decision-making problems, such as inventory management in chemical supply chains. However, enabling RL to explicitly consider known environment constraints is crucial for safe deployment in practical applications. This work incorporates recent tools for optimization over trained neural networks to introduce two algorithms for safe training and deployment of RL, with a focus on supply chains. Specifically, we use optimization over trained neural-network state-action value functions (i.e., a critic function) to directly incorporate constraints when computing actions in a continuous action space. Furthermore, we introduce a second algorithm that guarantees constraint satisfaction during deployment by directly implementing actions from constrained optimization of a trained value function. The algorithms are compared against state-of-the-art algorithms TRPO, CPO, and RCPO using a computational supply chain case study.
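The first algorithm's core idea, selecting a continuous action by optimizing a trained state-action value function subject to known environment constraints, can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: the paper encodes the trained network in a mathematical program via OMLT, whereas this sketch uses a randomly initialized toy critic and SciPy's SLSQP solver; the two-product action space, the shared `order_cap` capacity constraint, and all dimensions are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Stand-in for a trained critic Q(s, a): a tiny MLP with random weights.
# (Hypothetical; the paper uses a learned neural-network value function.)
W1 = rng.normal(size=(5, 8))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 1))
b2 = rng.normal(size=1)

def q_value(state, action):
    """Scalar state-action value from the toy critic."""
    x = np.concatenate([state, action])
    h = np.tanh(x @ W1 + b1)          # hidden layer
    return float(h @ W2 + b2)          # scalar Q-value

def constrained_greedy_action(state, order_cap=10.0):
    """Maximize Q(s, a) over a 2-D order quantity, subject to known
    constraints: a_i >= 0 and a_1 + a_2 <= order_cap (a hypothetical
    shared capacity limit across two products)."""
    constraints = [{"type": "ineq", "fun": lambda a: order_cap - a.sum()}]
    bounds = [(0.0, order_cap)] * 2
    result = minimize(lambda a: -q_value(state, a),   # maximize Q
                      x0=np.array([1.0, 1.0]),
                      bounds=bounds, constraints=constraints)
    return result.x

state = rng.normal(size=3)
action = constrained_greedy_action(state)
print(action)
```

The paper's second algorithm corresponds to executing the optimized action directly at deployment time, so every implemented action is feasible by construction rather than feasible only in expectation during training.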
Pages: 12
Related papers
(50 records in total)
  • [1] Continuous-Action Reinforcement Learning for Memory Allocation in Virtualized Servers
    Garrido, Luis A.
    Nishtala, Rajiv
    Carpenter, Paul
    [J]. HIGH PERFORMANCE COMPUTING: ISC HIGH PERFORMANCE 2019 INTERNATIONAL WORKSHOPS, 2020, 11887 : 13 - 24
  • [2] Human-in-the-Loop Reinforcement Learning in Continuous-Action Space
    Luo, Biao
    Wu, Zhengke
    Zhou, Fei
    Wang, Bing-Chuan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 10
  • [3] Continuous-Action Reinforcement Learning for Portfolio Allocation of a Life Insurance Company
    Abrate, Carlo
    Angius, Alessio
    De Francisci Morales, Gianmarco
    Cozzini, Stefano
    Iadanza, Francesca
    Li Puma, Laura
    Pavanelli, Simone
    Perotti, Alan
    Pignataro, Stefano
    Ronchiadin, Silvia
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: APPLIED DATA SCIENCE TRACK, PT IV, 2021, 12978 : 237 - 252
  • [4] Learning Continuous-Action Control Policies
    Pazis, Jason
    Lagoudakis, Michail G.
    [J]. ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 169 - 176
  • [5] Continuous-Action Q-Learning
    Millán, José del R.
    Posenato, Daniele
    Dedieu, Eric
    [J]. MACHINE LEARNING, 2002, 49 (2-3) : 247 - 265
  • [6] Carbon trading supply chain management based on constrained deep reinforcement learning
    Wang, Qinghao
    Yang, Yaodong
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (02)
  • [7] Orthogonal Adversarial Deep Reinforcement Learning for Discrete- and Continuous-Action Problems
    Ohashi, Kohei
    Nakanishi, Kosuke
    Goto, Nao
    Yasui, Yuji
    Ishii, Shin
    [J]. IEEE ACCESS, 2024, 12 : 151907 - 151919
  • [8] Inventory management in supply chains: a reinforcement learning approach
    Giannoccaro, I.
    Pontrandolfo, P.
    [J]. INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2002, 78 (02) : 153 - 161
  • [9] L2NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning
    Mills, Keith G.
    Han, Fred X.
    Salameh, Mohammad
    Rezaei, Seyed Saeed Changiz
    Kong, Linglong
    Lu, Wei
    Lian, Shuo
    Jui, Shangling
    Niu, Di
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 1284 - 1293