Constrained continuous-action reinforcement learning for supply chain inventory management

Cited by: 2
Authors
Burtea, Radu [1 ]
Tsay, Calvin [1 ]
Affiliations
[1] Imperial Coll London, Dept Comp, London SW7 2AZ, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
Keywords
Safe reinforcement learning; Optimization and machine learning toolkit (OMLT); Continuous-action Q-learning; Inventory management problem; COMPREHENSIVE SURVEY; PROGRAMMING APPROACH; SYSTEMS; OPTIMIZATION; POLICIES; MODEL
DOI
10.1016/j.compchemeng.2023.108518
Chinese Library Classification
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
Reinforcement learning (RL) is a promising solution for difficult decision-making problems, such as inventory management in chemical supply chains. However, enabling RL to explicitly consider known environment constraints is crucial for safe deployment in practical applications. This work incorporates recent tools for optimization over trained neural networks to introduce two algorithms for safe training and deployment of RL, with a focus on supply chains. Specifically, we use optimization over trained neural-network state-action value functions (i.e., a critic function) to directly incorporate constraints when computing actions in a continuous action space. Furthermore, we introduce a second algorithm that guarantees constraint satisfaction during deployment by directly implementing actions from constrained optimization of a trained value function. The algorithms are compared against state-of-the-art algorithms TRPO, CPO, and RCPO using a computational supply chain case study.
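The first algorithm's core idea, selecting a continuous action by optimizing a trained state-action value function subject to known environment constraints, can be sketched as follows. This is an illustrative stand-in, not the paper's implementation: the paper encodes the trained network in a mathematical program via OMLT, whereas this sketch uses a randomly initialized toy critic and SciPy's SLSQP solver; the two-product action space, the shared `order_cap` capacity constraint, and all dimensions are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Stand-in for a trained critic Q(s, a): a tiny MLP with random weights.
# (Hypothetical; the paper uses a learned neural-network value function.)
W1 = rng.normal(size=(5, 8))
b1 = rng.normal(size=8)
W2 = rng.normal(size=(8, 1))
b2 = rng.normal(size=1)

def q_value(state, action):
    """Scalar state-action value from the toy critic."""
    x = np.concatenate([state, action])
    h = np.tanh(x @ W1 + b1)          # hidden layer
    return float(h @ W2 + b2)          # scalar Q-value

def constrained_greedy_action(state, order_cap=10.0):
    """Maximize Q(s, a) over a 2-D order quantity, subject to known
    constraints: a_i >= 0 and a_1 + a_2 <= order_cap (a hypothetical
    shared capacity limit across two products)."""
    constraints = [{"type": "ineq", "fun": lambda a: order_cap - a.sum()}]
    bounds = [(0.0, order_cap)] * 2
    result = minimize(lambda a: -q_value(state, a),   # maximize Q
                      x0=np.array([1.0, 1.0]),
                      bounds=bounds, constraints=constraints)
    return result.x

state = rng.normal(size=3)
action = constrained_greedy_action(state)
print(action)
```

The paper's second algorithm corresponds to executing the optimized action directly at deployment time, so every implemented action is feasible by construction rather than feasible only in expectation during training.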
Pages: 12
Related papers
(50 records in total)
  • [1] Continuous-Action Reinforcement Learning for Memory Allocation in Virtualized Servers
    Garrido, Luis A.
    Nishtala, Rajiv
    Carpenter, Paul
    [J]. HIGH PERFORMANCE COMPUTING: ISC HIGH PERFORMANCE 2019 INTERNATIONAL WORKSHOPS, 2020, 11887 : 13 - 24
  • [2] Human-in-the-Loop Reinforcement Learning in Continuous-Action Space
    Luo, Biao
    Wu, Zhengke
    Zhou, Fei
    Wang, Bing-Chuan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 10
  • [3] Continuous-Action Reinforcement Learning for Portfolio Allocation of a Life Insurance Company
    Abrate, Carlo
    Angius, Alessio
    De Francisci Morales, Gianmarco
    Cozzini, Stefano
    Iadanza, Francesca
    Li Puma, Laura
    Pavanelli, Simone
    Perotti, Alan
    Pignataro, Stefano
    Ronchiadin, Silvia
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: APPLIED DATA SCIENCE TRACK, PT IV, 2021, 12978 : 237 - 252
  • [4] Learning Continuous-Action Control Policies
    Pazis, Jason
    Lagoudakis, Michail G.
    [J]. ADPRL: 2009 IEEE SYMPOSIUM ON ADAPTIVE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2009, : 169 - 176
  • [5] Continuous-Action Q-Learning
    Millán, José del R.
    Posenato, Daniele
    Dedieu, Eric
    [J]. MACHINE LEARNING, 2002, 49 (2-3) : 247 - 265
  • [6] Carbon trading supply chain management based on constrained deep reinforcement learning
    Wang, Qinghao
    Yang, Yaodong
    [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2024, 38 (02)
  • [7] Orthogonal Adversarial Deep Reinforcement Learning for Discrete- and Continuous-Action Problems
    Ohashi, Kohei
    Nakanishi, Kosuke
    Goto, Nao
    Yasui, Yuji
    Ishii, Shin
    [J]. IEEE ACCESS, 2024, 12 : 151907 - 151919
  • [8] Inventory management in supply chains: a reinforcement learning approach
    Giannoccaro, I.
    Pontrandolfo, P.
    [J]. INTERNATIONAL JOURNAL OF PRODUCTION ECONOMICS, 2002, 78 (02) : 153 - 161
  • [9] L2NAS: Learning to Optimize Neural Architectures via Continuous-Action Reinforcement Learning
    Mills, Keith G.
    Han, Fred X.
    Salameh, Mohammad
    Rezaei, Seyed Saeed Changiz
    Kong, Linglong
    Lu, Wei
    Lian, Shuo
    Jui, Shangling
    Niu, Di
    [J]. PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, CIKM 2021, 2021, : 1284 - 1293