A survey and comparative evaluation of actor-critic methods in process control

Cited by: 20
Authors
Dutta, Debaprasad [1 ]
Upreti, Simant R. [1 ]
Affiliations
[1] Toronto Metropolitan Univ, Dept Chem Engn, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
actor-critic methods; process control; reinforcement learning; MODEL-PREDICTIVE CONTROL; LEARNING CONTROL; BATCH PROCESSES; NEURO-CONTROL; REINFORCEMENT; SYSTEM; PERFORMANCE; FRAMEWORK;
DOI
10.1002/cjce.24508
Chinese Library Classification (CLC)
TQ [Chemical Industry]
Discipline code
0817
Abstract
Actor-critic (AC) methods have emerged as an important class of reinforcement learning (RL) paradigms that enable model-free control by acting on a process and learning from the consequences. To that end, these methods utilize artificial neural networks, which are synergized for action evaluation and optimal action prediction. This feature is highly desirable for process control, especially when knowledge about a process is limited or when the process is susceptible to uncertainties. In this work, we summarize the important concepts of AC methods and survey their process control applications. This treatment is followed by a comparative evaluation of the set-point tracking and robustness of controllers based on five prominent AC methods, namely deep deterministic policy gradient (DDPG), twin-delayed DDPG (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), and trust region policy optimization (TRPO), in five case studies of varying process nonlinearity. The training demands and control performances indicate the superiority of the DDPG and TD3 methods, which rely on off-policy, deterministic search for optimal action policies. Overall, the knowledge base and results of this work are expected to serve practitioners in their efforts toward further development of autonomous process control strategies.
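The actor-critic interplay that the abstract describes, a critic that evaluates the taken action and an actor that adjusts the policy from that evaluation, can be illustrated with a minimal sketch. The following one-step temporal-difference actor-critic on a hypothetical two-state toy task is not from the paper; the environment, parameterization, and hyperparameters are all assumptions made purely for illustration:

```python
import numpy as np

# Hypothetical toy task: from state 0, action 1 reaches the terminal goal
# state (reward 1); action 0 stays in place (reward 0).
def step(state, action):
    next_state = 1 if action == 1 else state
    reward = 1.0 if next_state == 1 else 0.0
    return next_state, reward, next_state == 1

rng = np.random.default_rng(0)
theta = np.zeros(2)  # actor parameters: per-state preference for action 1
w = np.zeros(2)      # critic parameters: per-state value estimate
gamma, alpha_actor, alpha_critic = 0.9, 0.5, 0.2

for episode in range(1000):
    s, done = 0, False
    while not done:
        p1 = 1.0 / (1.0 + np.exp(-theta[s]))          # sigmoid policy over two actions
        a = 1 if rng.random() < p1 else 0
        s2, r, done = step(s, a)
        v_next = 0.0 if done else w[s2]
        td_error = r + gamma * v_next - w[s]           # critic evaluates the action
        w[s] += alpha_critic * td_error                # critic update (value estimate)
        grad_log_pi = a - p1                           # d/dtheta log pi(a|s)
        theta[s] += alpha_actor * td_error * grad_log_pi  # actor update (policy improvement)
        s = s2

# After training, the policy should strongly prefer the goal-seeking action.
print(1.0 / (1.0 + np.exp(-theta[0])))
```

The critic's TD error serves double duty: it refines the value estimate and, as a learned advantage signal, scales the actor's policy-gradient step, which is the core mechanism shared by the deep AC methods (DDPG, TD3, SAC, PPO, TRPO) compared in the paper.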
Pages: 2028-2056 (29 pages)
Related articles (50 total)
  • [21] Actor-Critic Reinforcement Learning for Control With Stability Guarantee
    Han, Minghao
    Zhang, Lixian
    Wang, Jun
    Pan, Wei
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2020, 5 (04) : 6217 - 6224
  • [22] Variational actor-critic algorithms
    Zhu, Yuhua
    Ying, Lexing
    ESAIM-CONTROL OPTIMISATION AND CALCULUS OF VARIATIONS, 2023, 29
  • [23] Error controlled actor-critic
    Gao, Xingen
    Chao, Fei
    Zhou, Changle
    Ge, Zhen
    Yang, Longzhi
    Chang, Xiang
    Shang, Changjing
    Shen, Qiang
    INFORMATION SCIENCES, 2022, 612 : 62 - 74
  • [24] A Hessian Actor-Critic Algorithm
    Wang, Jing
    Paschalidis, Ioannis Ch.
    2014 IEEE 53RD ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2014, : 1131 - 1136
  • [25] Learning State Representation for Deep Actor-Critic Control
    Munk, Jelle
    Kober, Jens
    Babuska, Robert
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4667 - 4673
  • [26] Natural actor-critic algorithms
    Bhatnagar, Shalabh
    Sutton, Richard S.
    Ghavamzadeh, Mohammad
    Lee, Mark
    AUTOMATICA, 2009, 45 (11) : 2471 - 2482
  • [27] Actor-Critic Instance Segmentation
    Araslanov, Nikita
    Rothkopf, Constantin A.
    Roth, Stefan
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 8229 - 8238
  • [28] Least Squares Temporal Difference Actor-Critic Methods with Applications to Robot Motion Control
    Estanjini, Reza Moazzez
    Ding, Xu Chu
    Lahijanian, Morteza
    Wang, Jing
    Belta, Calin A.
    Paschalidis, Ioannis Ch.
    2011 50TH IEEE CONFERENCE ON DECISION AND CONTROL AND EUROPEAN CONTROL CONFERENCE (CDC-ECC), 2011, : 704 - 709
  • [29] Actor-Critic or Critic-Actor? A Tale of Two Time Scales
    Bhatnagar, Shalabh
    Borkar, Vivek S.
    Guin, Soumyajit
    IEEE CONTROL SYSTEMS LETTERS, 2023, 7 : 2671 - 2676
  • [30] Noisy Importance Sampling Actor-Critic: An Off-Policy Actor-Critic With Experience Replay
    Tasfi, Norman
    Capretz, Miriam
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020