An Actor-Critic Algorithm for the Stochastic Cutting Stock Problem

Authors
Su, Jie-Ying [1]
Kang, Jia-Lin [2]
Jang, Shi-Shang [1]
Affiliations
[1] Natl Tsing Hua Univ, Dept Chem Engn, Hsinchu 300, Taiwan
[2] Natl Yunlin Univ Sci & Technol, Dept Chem & Mat Engn, Yunlin 64002, Taiwan
Keywords
reinforcement learning; stochastic cutting stock problem; advantage actor-critic; discount factor; continuous action space
DOI
10.3390/pr11041203
Chinese Library Classification (CLC) number
TQ [Chemical Industry]
Discipline classification code
0817
Abstract
Inventory level has a significant influence on the cost of process scheduling. The stochastic cutting stock problem (SCSP) is a complicated inventory-level scheduling problem because it involves random variables. In this study, we applied a model-free, on-policy reinforcement learning (RL) approach based on a well-known RL method, Advantage Actor-Critic, to solve an SCSP example. Our RL model has two goals: avoiding constraint violations and minimizing cost. To balance these goals across training stages, we proposed a two-stage discount factor algorithm and adopted the game concept of ending an episode when an action violates any constraint. Experimental results demonstrate that the proposed method obtains low-cost solutions and can continuously generate actions that satisfy the constraints. Additionally, the two-stage discount factor algorithm trained the model faster while maintaining a good balance between the two goals.
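The abstract names three ingredients: an advantage estimate, a discount factor that changes between two training stages, and episodes that end at the first constraint violation. The following is a minimal Python sketch of how these pieces could fit together, not the authors' implementation. The function names, the switch point, the two discount values, and the order in which they are used are all hypothetical, since the abstract does not specify them.

```python
from typing import List, Sequence


def two_stage_gamma(episode: int, switch_episode: int,
                    gamma_stage1: float, gamma_stage2: float) -> float:
    """Return the discount factor for the current training episode.

    Assumption: the schedule is a hard switch at `switch_episode`.
    The abstract only says the discount factor differs between stages.
    """
    return gamma_stage1 if episode < switch_episode else gamma_stage2


def truncate_at_violation(rewards: Sequence[float],
                          violated: Sequence[bool]) -> List[float]:
    """Keep rewards up to and including the first violating action,
    mirroring the idea that a constraint violation ends the episode."""
    kept: List[float] = []
    for reward, is_violation in zip(rewards, violated):
        kept.append(reward)
        if is_violation:
            break
    return kept


def discounted_returns(rewards: Sequence[float], gamma: float) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1}, working backward from the end."""
    returns: List[float] = []
    running = 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns: Sequence[float], values: Sequence[float]) -> List[float]:
    """Advantage estimate A_t = G_t - V(s_t), with V supplied by a critic."""
    return [g - v for g, v in zip(returns, values)]


# Hypothetical usage: placeholder numbers, not values from the paper.
gamma = two_stage_gamma(episode=3, switch_episode=10,
                        gamma_stage1=0.5, gamma_stage2=0.99)
rewards = truncate_at_violation([1.0, 1.0, 1.0, 1.0],
                                [False, False, True, False])
returns = discounted_returns(rewards, gamma)
adv = advantages(returns, [1.0, 1.0, 1.0])
```

In this sketch the discount factor controls how strongly later rewards, including the reward at the terminating violation, influence earlier actions, which is one plausible way a two-stage schedule could shift emphasis between constraint satisfaction and cost.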
Pages: 13