Funnel-Based Reward Shaping for Signal Temporal Logic Tasks in Reinforcement Learning

Cited by: 2
Authors
Saxena, Naman [1]
Gorantla, Sandeep [2]
Jagtap, Pushpak [2]
Affiliations
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, India
[2] Indian Inst Sci, Robert Bosch Ctr Cyber Phys Syst, Bangalore 560012, India
Keywords
Machine learning for robot control; reinforcement learning
DOI
10.1109/LRA.2023.3341775
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification codes
080202; 1405
Abstract
Signal Temporal Logic (STL) is a powerful framework for describing the complex temporal and logical behaviour of dynamical systems. Numerous studies have attempted to use reinforcement learning to learn controllers that enforce STL specifications; however, they have not effectively addressed the challenges of ensuring robust satisfaction in continuous state spaces while maintaining tractability. In this letter, leveraging the concept of funnel functions, we propose a tractable reinforcement learning algorithm that learns a time-dependent policy for robust satisfaction of STL specifications in continuous state spaces. We demonstrate the utility of our approach on several STL tasks in different environments.
Pages: 1373-1379
Number of pages: 7