Risk-averse supply chain management via robust reinforcement learning

Cited by: 0
Authors
Wang, Jing [1 ]
Swartz, Christopher L.E. [2 ]
Huang, Kai [3 ]
Affiliations
[1] School of Computational Science and Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4K1, Canada
[2] Department of Chemical Engineering, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4L7, Canada
[3] DeGroote School of Business, McMaster University, 1280 Main Street West, Hamilton, ON L8S 4M4, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
DOI
10.1016/j.compchemeng.2024.108912
Abstract
Classical reinforcement learning (RL) may suffer performance degradation when the environment deviates from training conditions, limiting its application in risk-averse supply chain management. This work explores using robust RL in supply chain operations to hedge against environment inconsistencies and changes. Two robust RL algorithms, Q̂-learning and β-pessimistic Q-learning, are examined against conventional Q-learning and a baseline order-up-to inventory policy. Furthermore, this work extends RL applications from forward to closed-loop supply chains. Two case studies are conducted using a supply chain simulator developed with agent-based modeling. The results show that Q-learning can outperform the baseline policy under normal conditions, but notably degrades under environment deviations. By comparison, the robust RL models tend to make more conservative inventory decisions to avoid large shortage penalties. Specifically, fine-tuned β-pessimistic Q-learning can achieve good performance under normal conditions and maintain robustness against moderate environment inconsistencies, making it suitable for risk-averse decision-making. © 2024 Elsevier Ltd
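
For readers unfamiliar with the algorithms named above, the following is a minimal sketch of the tabular β-pessimistic Q-learning update in Python. It is illustrative, not code from the paper; the function name and hyperparameter values (alpha, gamma, beta) are assumptions chosen for the example. Setting beta = 0 recovers conventional Q-learning, while beta = 1 gives the fully pessimistic min-based backup characteristic of Q̂-learning.

    import numpy as np

    def beta_pessimistic_update(Q, s, a, r, s_next,
                                alpha=0.1, gamma=0.95, beta=0.3):
        # Illustrative sketch; hyperparameter values are assumptions.
        # beta = 0 -> standard Q-learning;
        # beta = 1 -> the fully pessimistic (min) backup of Q-hat-learning.
        optimistic = np.max(Q[s_next])   # greedy bootstrap over next actions
        pessimistic = np.min(Q[s_next])  # worst-case bootstrap
        target = r + gamma * ((1.0 - beta) * optimistic + beta * pessimistic)
        Q[s, a] += alpha * (target - Q[s, a])  # temporal-difference step
        return Q

A larger beta places more weight on the worst-case action value, which is consistent with the more conservative inventory decisions the abstract reports for the robust models.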
Related papers (50 in total)
  • [1] Efficient Risk-Averse Reinforcement Learning
    Greenberg, Ido
    Chow, Yinlam
    Ghavamzadeh, Mohammad
    Mannor, Shie
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,
  • [2] Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning
    Queeney, James
    Benosman, Mouhacine
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [3] Pricing Policy in Green Supply Chain Management with a Risk-Averse Retailer
    Li, Bo
    Jiang, Yushan
    Qu, Xiaolong
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEM), 2017, : 393 - 397
  • [4] Risk-averse Reinforcement Learning for Algorithmic Trading
    Shen, Yun
    Huang, Ruihong
    Yan, Chang
    Obermayer, Klaus
    [J]. 2014 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR FINANCIAL ENGINEERING & ECONOMICS (CIFER), 2014, : 391 - 398
  • [5] Risk-averse Reinforcement Learning for Portfolio Optimization
    Enkhsaikhan, Bayaraa
    Jo, Ohyun
    [J]. ICT EXPRESS, 2024, 10 (04): 857 - 862
  • [6] A survey on risk-averse and robust revenue management
    Goensch, Jochen
    [J]. EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2017, 263 (02) : 337 - 348
  • [7] The coordination mechanism of a risk-averse green supply chain
    Wang, Yuhong
    Sheng, Xiaoqi
    Xie, Yudie
    [J]. CHINESE MANAGEMENT STUDIES, 2024, 18 (01) : 174 - 195
  • [8] Coordination of a risk-averse supply chain with price competition
    Tian, Yu
    Huang, Dao
    Liu, He
    [J]. ICIEA 2007: 2ND IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, VOLS 1-4, PROCEEDINGS, 2007, : 1415 - 1420
  • [9] Strategy of supply chain coordination with risk-averse bias
    School of Economics and Management, Southeast University, Nanjing 210096, China
    [Authors not specified]
    [J]. Dongnan Daxue Xuebao (Journal of Southeast University), 2007, Suppl. 2: 337 - 342
  • [10] Risk-Averse Reinforcement Learning via Dynamic Time-Consistent Risk Measures
    Yu, Xian
    Shen, Siqian
    [J]. 2022 IEEE 61ST CONFERENCE ON DECISION AND CONTROL (CDC), 2022, : 2307 - 2312