A survey and comparative evaluation of actor-critic methods in process control

Cited by: 20
Authors
Dutta, Debaprasad [1 ]
Upreti, Simant R. [1 ]
Affiliations
[1] Toronto Metropolitan Univ, Dept Chem Engn, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
actor-critic methods; process control; reinforcement learning; MODEL-PREDICTIVE CONTROL; LEARNING CONTROL; BATCH PROCESSES; NEURO-CONTROL; REINFORCEMENT; SYSTEM; PERFORMANCE; FRAMEWORK;
DOI
10.1002/cjce.24508
Chinese Library Classification (CLC)
TQ [Chemical Industry]
Discipline code
0817
Abstract
Actor-critic (AC) methods have emerged as an important class of reinforcement learning (RL) paradigms that enable model-free control by acting on a process and learning from the consequences. To that end, these methods utilize artificial neural networks, which are synergized for action evaluation and optimal action prediction. This feature is highly desirable for process control, especially when knowledge about a process is limited or when the process is susceptible to uncertainties. In this work, we summarize the important concepts of AC methods and survey their process control applications. This treatment is followed by a comparative evaluation of the set-point tracking and robustness of controllers based on five prominent AC methods, namely deep deterministic policy gradient (DDPG), twin-delayed DDPG (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), and trust region policy optimization (TRPO), in five case studies of varying process nonlinearity. The training demands and control performances indicate the superiority of the DDPG and TD3 methods, which rely on off-policy, deterministic search for optimal action policies. Overall, the knowledge base and results of this work are expected to serve practitioners in their efforts toward further development of autonomous process control strategies.
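The actor-critic interplay that the abstract describes, a critic that evaluates the taken action and an actor that adjusts the policy from that evaluation, can be illustrated with a minimal sketch. The following one-step temporal-difference actor-critic on a hypothetical two-state toy task is not from the paper; the environment, parameterization, and hyperparameters are all assumptions made purely for illustration:

```python
import numpy as np

# Hypothetical toy task: from state 0, action 1 reaches the terminal goal
# state (reward 1); action 0 stays in place (reward 0).
def step(state, action):
    next_state = 1 if action == 1 else state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward, next_state == 1

rng = np.random.default_rng(0)
theta = np.zeros(2)  # actor parameters: per-state preference for action 1
w = np.zeros(2)      # critic parameters: per-state value estimate
gamma, alpha_actor, alpha_critic = 0.9, 0.5, 0.2

for episode in range(1000):
    s, done = 0, False
    while not done:
        p1 = 1.0 / (1.0 + np.exp(-theta[s]))          # sigmoid policy over two actions
        a = 1 if rng.random() < p1 else 0
        s2, r, done = step(s, a)
        v_next = 0.0 if done else w[s2]
        td_error = r + gamma * v_next - w[s]           # critic evaluates the action
        w[s] += alpha_critic * td_error                # critic update (value estimate)
        grad_log_pi = a - p1                           # d/dtheta log pi(a|s)
        theta[s] += alpha_actor * td_error * grad_log_pi  # actor update (policy improvement)
        s = s2

# After training, the policy should strongly prefer the goal-seeking action.
print(1.0 / (1.0 + np.exp(-theta[0])))
```

The critic's TD error serves double duty: it refines the value estimate and, as a learned advantage signal, scales the actor's policy-gradient step, which is the core mechanism shared by the deep AC methods (DDPG, TD3, SAC, PPO, TRPO) compared in the paper.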
Pages: 2028-2056 (29 pages)
Related articles (50 total)
  • [21] Actor-Critic Reinforcement Learning for Control With Stability Guarantee
    Han, Minghao
    Zhang, Lixian
    Wang, Jun
    Pan, Wei
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 6217 - 6224
  • [22] Variational actor-critic algorithms
    Zhu, Yuhua
    Ying, Lexing
    ESAIM-CONTROL OPTIMISATION AND CALCULUS OF VARIATIONS, 2023, 29
  • [23] Error controlled actor-critic
    Gao, Xingen
    Chao, Fei
    Zhou, Changle
    Ge, Zhen
    Yang, Longzhi
    Chang, Xiang
    Shang, Changjing
    Shen, Qiang
    INFORMATION SCIENCES, 2022, 612 : 62 - 74
  • [24] A Hessian Actor-Critic Algorithm
    Wang, Jing
    Paschalidis, Ioannis Ch.
    2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 1131 - 1136
  • [25] Learning State Representation for Deep Actor-Critic Control
    Munk, Jelle
    Kober, Jens
    Babuska, Robert
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4667 - 4673
  • [26] Natural actor-critic algorithms
    Bhatnagar, Shalabh
    Sutton, Richard S.
    Ghavamzadeh, Mohammad
    Lee, Mark
    AUTOMATICA, 2009, 45 (11) : 2471 - 2482
  • [27] Actor-Critic Instance Segmentation
    Araslanov, Nikita
    Rothkopf, Constantin A.
    Roth, Stefan
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 8229 - 8238
  • [28] Least Squares Temporal Difference Actor-Critic Methods with Applications to Robot Motion Control
    Estanjini, Reza Moazzez
    Ding, Xu Chu
    Lahijanian, Morteza
    Wang, Jing
    Belta, Calin A.
    Paschalidis, Ioannis Ch.
    2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011, : 704 - 709
  • [29] Actor-Critic or Critic-Actor? A Tale of Two Time Scales
    Bhatnagar, Shalabh
    Borkar, Vivek S.
    Guin, Soumyajit
    IEEE CONTROL SYSTEMS LETTERS, 2023, 7 : 2671 - 2676
  • [30] Noisy Importance Sampling Actor-Critic: An Off-Policy Actor-Critic With Experience Replay
    Tasfi, Norman
    Capretz, Miriam
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020