Safe Reinforcement Learning: A Survey

Authors
Wang, Xue-Song [1]
Wang, Rong-Rong [1]
Cheng, Yu-Hu [1]
Affiliation
[1] School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
Funding
National Natural Science Foundation of China
Keywords
Learning algorithms - Markov processes - Process control - Robots - Safety factor
DOI
10.16383/j.aas.c220631
Abstract
Reinforcement learning (RL) has achieved prominent success in the game of Go, video games, navigation, recommendation systems, and other fields. However, many reinforcement learning algorithms cannot be transferred directly to real physical environments. In simulation, the agent can interact with the environment by trial and error to learn the optimal policy; in many real-world applications, by contrast, system safety requires that the agent's random exploration be restricted. Safety has therefore become an essential factor in moving reinforcement learning from simulation to reality. In recent years, much research has been devoted to developing safe reinforcement learning (SRL) algorithms that satisfy safety constraints while maintaining system performance. This paper presents a comprehensive survey of existing SRL algorithms, which are divided into three categories: modification of the learning process, modification of the learning objective, and offline reinforcement learning. Five experimental platforms are then introduced: Safety Gym, safe-control-gym, SafeRL-Kit, D4RL, and NeoRL. Lastly, applications of SRL in autonomous driving, robot control, industrial process control, power system optimization, and healthcare are summarized, followed by a brief conclusion and outlook. © 2023 Science Press. All rights reserved.
Pages: 1813-1835