Reward Shaping from Hybrid Systems Models in Reinforcement Learning

Cited by: 2
Authors
Qian, Marian [1]
Mitsch, Stefan [1]
Affiliations
[1] Carnegie Mellon Univ, Comp Sci Dept, Pittsburgh, PA 15213 USA
Source
NASA FORMAL METHODS, NFM 2023 | 2023 / Vol. 13903
Keywords
Theorem proving; Differential dynamic logic; Hybrid systems; Reinforcement learning; Reward shaping; Robustness
DOI
10.1007/978-3-031-33170-1_8
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Reinforcement learning is increasingly often used as a learning technique to implement control tasks in autonomous systems. To meet stringent safety requirements, formal methods for learning-enabled systems, such as closed-loop neural network verification, shielding, falsification, and online reachability analysis, analyze learned controllers for safety violations. Besides filtering unsafe actions during training, these approaches view verification and training largely as separate tasks. We propose an approach based on logically constrained reinforcement learning to couple formal methods and reinforcement learning more tightly by generating safety-oriented aspects of reward functions from verified hybrid systems models. We demonstrate the approach on a standard reinforcement learning environment for longitudinal vehicle control.
Pages: 122-139
Number of pages: 18