On Normative Reinforcement Learning via Safe Reinforcement Learning

Citations: 0
Authors
Neufeld, Emery A. [1]
Bartocci, Ezio [1]
Ciabattoni, Agata [1]
Affiliations
[1] TU Wien, Vienna, Austria
DOI
10.1007/978-3-031-21203-1_5
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Reinforcement learning (RL) has proven to be a successful technique for teaching autonomous agents goal-directed behaviour. As RL agents become further integrated into our society, they must learn to comply with ethical, social, or legal norms. Defeasible deontic logics are natural formal frameworks for specifying and reasoning about such norms in a transparent way, but their effective and efficient integration into RL agents remains an open problem. On the other hand, linear temporal logic (LTL) has been successfully employed to synthesize RL policies satisfying, e.g., safety requirements. In this paper, we investigate the extent to which the established machinery for safe reinforcement learning can be leveraged to direct the normative behaviour of RL agents. We analyze some of the difficulties that arise from attempting to represent norms with LTL, provide an algorithm for synthesizing LTL specifications from certain normative systems, and analyze its power and limits with a case study.
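The safe-RL machinery the abstract refers to can be illustrated with a minimal shielding sketch: given a safety property of the form G(¬forbidden) (as might be synthesized from a prohibition norm), a shield filters the agent's candidate actions to those whose successor state keeps the property satisfied. The toy transition table, state names, and `shield` function below are illustrative assumptions, not the paper's algorithm:

```python
# Minimal shielding sketch (illustrative, not the paper's construction).
# A safety property G(!forbidden) is enforced by removing, at each step,
# any action whose successor state satisfies the "forbidden" proposition.

# Toy deterministic transition model: state -> action -> next state.
TRANSITIONS = {
    "s0": {"left": "s1", "right": "s2"},
    "s1": {"left": "s1", "right": "s0"},
    "s2": {"left": "s0", "right": "s2"},
}

# Labelling: states where the prohibited proposition holds.
FORBIDDEN = {"s2"}

def shield(state, candidate_actions):
    """Keep only the actions whose successor satisfies G(!forbidden)."""
    return [a for a in candidate_actions
            if TRANSITIONS[state][a] not in FORBIDDEN]
```

In a shielded RL loop, the learner would then pick its highest-valued action from `shield(state, actions)` instead of the full action set; the paper's contribution concerns how (and whether) such LTL-style properties can be synthesized from defeasible deontic norms in the first place.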
Pages: 72-89 (18 pages)
Related Papers (50 entries in total)
  • [1] Safe Reinforcement Learning via Shielding
    Alshiekh, Mohammed; Bloem, Roderick; Ehlers, Ruediger; Koenighofer, Bettina; Niekum, Scott; Topcu, Ufuk
    Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018: 2669-2678
  • [2] Safe Reinforcement Learning via Curriculum Induction
    Turchetta, Matteo; Kolobov, Andrey; Shah, Shital; Krause, Andreas; Agarwal, Alekh
    Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020
  • [3] Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning
    Horie, Naoto; Matsui, Tohgoroh; Moriyama, Koichi; Mutoh, Atsuko; Inuzuka, Nobuhiro
    Artificial Life and Robotics, 2019, 24(3): 352-359
  • [5] Safe Reinforcement Learning via Probabilistic Logic Shields
    Yang, Wen-Chi; Marra, Giuseppe; Rens, Gavin; De Raedt, Luc
    Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023: 5739-5749
  • [6] Safe HVAC Control via Batch Reinforcement Learning
    Liu, Hsin-Yu; Balaji, Bharathan; Gao, Sicun; Gupta, Rajesh; Hong, Dezhi
    13th ACM/IEEE International Conference on Cyber-Physical Systems (ICCPS 2022), 2022: 181-192
  • [7] Safe Reinforcement Learning: A Survey
    Wang, X.-S.; Wang, R.-R.; Cheng, Y.-H.
    Acta Automatica Sinica, 2023, 49(9): 1813-1835
  • [8] A Normative Supervisor for Reinforcement Learning Agents
    Neufeld, Emery; Bartocci, Ezio; Ciabattoni, Agata; Governatori, Guido
    Automated Deduction (CADE 28), 2021, 12699: 565-576
  • [9] Safe Reinforcement Learning via Formal Methods: Toward Safe Control Through Proof and Learning
    Fulton, Nathan; Platzer, Andre
    Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018: 6485-6492
  • [10] Safe Reinforcement Learning via Statistical Model Predictive Shielding
    Bastani, Osbert; Li, Shuo; Xu, Anton
    Robotics: Science and Systems, 2021