A survey and comparative evaluation of actor-critic methods in process control

Cited by: 20
Authors
Dutta, Debaprasad [1]
Upreti, Simant R. [1]
Affiliations
[1] Toronto Metropolitan Univ, Dept Chem Engn, Toronto, ON, Canada
Source
The Canadian Journal of Chemical Engineering
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
actor-critic methods; process control; reinforcement learning; MODEL-PREDICTIVE CONTROL; LEARNING CONTROL; BATCH PROCESSES; NEURO-CONTROL; REINFORCEMENT; SYSTEM; PERFORMANCE; FRAMEWORK;
DOI
10.1002/cjce.24508
Chinese Library Classification
TQ [Chemical Industry]
Discipline Code
0817
Abstract
Actor-critic (AC) methods have emerged as an important class of the reinforcement learning (RL) paradigm that enables model-free control by acting on a process and learning from the consequences. To that end, these methods employ artificial neural networks that work in tandem to evaluate actions and to predict optimal actions. This feature is highly desirable for process control, especially when knowledge about a process is limited or when the process is susceptible to uncertainties. In this work, we summarize the key concepts of AC methods and survey their process control applications. This treatment is followed by a comparative evaluation of the set-point tracking and robustness of controllers based on five prominent AC methods, namely, deep deterministic policy gradient (DDPG), twin delayed DDPG (TD3), soft actor-critic (SAC), proximal policy optimization (PPO), and trust region policy optimization (TRPO), in five case studies of varying process nonlinearity. The training demands and control performance indicate the superiority of the DDPG and TD3 methods, which rely on an off-policy, deterministic search for optimal action policies. Overall, the knowledge base and results of this work are expected to serve practitioners in their efforts toward the further development of autonomous process control strategies.
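To make the actor-critic idea concrete, the sketch below shows a minimal DDPG-style update in PyTorch, since the abstract singles out the off-policy, deterministic methods (DDPG and TD3). It is an illustrative sketch only, not the implementation from the surveyed paper: the 3-dimensional process state (e.g., measured outputs plus set-point error), the 1-dimensional bounded control action, the network sizes, and all hyperparameters are assumptions chosen for brevity.

```python
# Minimal DDPG-style actor-critic update (illustrative sketch, not the paper's code).
# Assumed setup: 3-dim process state (measurements + set-point error), 1-dim bounded action.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA, TAU = 3, 1, 0.99, 0.005  # placeholder hyperparameters

def mlp(in_dim, out_dim, out_act=None):
    # Small two-layer network used for both actor and critic.
    layers = [nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim)]
    if out_act is not None:
        layers.append(out_act)
    return nn.Sequential(*layers)

# Actor: state -> deterministic action in [-1, 1].  Critic: (state, action) -> Q-value.
actor = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())
critic = mlp(STATE_DIM + ACTION_DIM, 1)
actor_target = mlp(STATE_DIM, ACTION_DIM, nn.Tanh())
critic_target = mlp(STATE_DIM + ACTION_DIM, 1)
actor_target.load_state_dict(actor.state_dict())
critic_target.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def update(batch):
    s, a, r, s_next, done = batch
    # Critic step: regress Q(s, a) toward the bootstrapped TD target.
    with torch.no_grad():
        q_next = critic_target(torch.cat([s_next, actor_target(s_next)], dim=1))
        target = r + GAMMA * (1.0 - done) * q_next
    critic_loss = nn.functional.mse_loss(critic(torch.cat([s, a], dim=1)), target)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # Actor step: ascend the critic's estimate of Q(s, actor(s)).
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
    # Polyak-average the target networks for stable off-policy learning.
    for net, tgt in ((actor, actor_target), (critic, critic_target)):
        for p, p_t in zip(net.parameters(), tgt.parameters()):
            p_t.data.mul_(1 - TAU).add_(TAU * p.data)

# Usage: a random mini-batch stands in for samples drawn from a replay buffer.
B = 32
batch = (torch.randn(B, STATE_DIM), torch.rand(B, ACTION_DIM) * 2 - 1,
         torch.randn(B, 1), torch.randn(B, STATE_DIM), torch.zeros(B, 1))
update(batch)
```

The replay-buffer mini-batch and the slowly tracking target networks are what make this family off-policy: past process data can be reused for many updates, which is one plausible reason the abstract reports lower training demands for DDPG and TD3 than for the on-policy PPO and TRPO methods.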
Pages: 2028-2056
Number of pages: 29
Related papers
50 records in total
  • [1] Efficient Model Learning Methods for Actor-Critic Control
    Grondman, Ivo; Vaandrager, Maarten; Busoniu, Lucian; Babuska, Robert; Schuitema, Erik
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2012, 42(3): 591-602
  • [2] Temporal Logic Motion Control using Actor-Critic Methods
    Ding, Xu Chu; Wang, Jing; Lahijanian, Morteza; Paschalidis, Ioannis Ch.; Belta, Calin A.
    2012 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2012: 4687-4692
  • [3] Temporal logic motion control using actor-critic methods
    Wang, Jing; Ding, Xuchu; Lahijanian, Morteza; Paschalidis, Ioannis Ch.; Belta, Calin A.
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2015, 34(10): 1329-1344
  • [4] Classical Actor-Critic Applied to the Control of a Self-Regulatory Process
    Bras, E. H.; Louw, T. M.; Bradshaw, S. M.
    IFAC PAPERSONLINE, 2023, 56(2): 7172-7177
  • [5] Actor-Critic Model Predictive Control
    Romero, Angel; Song, Yunlong; Scaramuzza, Davide
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2024), 2024: 14777-14784
  • [6] TD-regularized actor-critic methods
    Parisi, Simone; Tangkaratt, Voot; Peters, Jan; Khan, Mohammad Emtiyaz
    MACHINE LEARNING, 2019, 108: 1467-1501
  • [7] TD-regularized actor-critic methods
    Parisi, Simone; Tangkaratt, Voot; Peters, Jan; Khan, Mohammad Emtiyaz
    MACHINE LEARNING, 2019, 108(8-9): 1467-1501
  • [8] Actor-critic algorithms
    Konda, VR; Tsitsiklis, JN
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12: 1008-1014
  • [9] Mild Policy Evaluation for Offline Actor-Critic
    Huang, Longyang; Dong, Botao; Lu, Jinhui; Zhang, Weidong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35(12): 17950-17964
  • [10] On actor-critic algorithms
    Konda, VR; Tsitsiklis, JN
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42(4): 1143-1166