Reinforcement learning with model-based feedforward inputs for robotic table tennis

Cited: 0
Authors
Hao Ma
Dieter Büchler
Bernhard Schölkopf
Michael Muehlebach
Affiliations
[1] Max Planck Institute for Intelligent Systems, Learning and Dynamical Systems
[2] Max Planck Institute for Intelligent Systems, Empirical Inference
Source
Autonomous Robots | 2023, Vol. 47
Keywords
Reinforcement learning; Iterative learning control; Supervised learning; Table tennis robot; Pneumatic artificial muscle; Soft robotics;
DOI
Not available
Abstract
We rethink the traditional reinforcement learning approach, which is based on optimizing over feedback policies, and propose a new framework that optimizes over feedforward inputs instead. This not only mitigates the risk of destabilizing the system during training but also reduces the bulk of the learning to a supervised learning task. As a result, efficient and well-understood supervised learning techniques can be applied and tuned using a validation data set. The labels are generated with a variant of iterative learning control, which also includes prior knowledge about the underlying dynamics. Our framework is applied to intercepting and returning ping-pong balls that are played to a four-degrees-of-freedom robotic arm in real-world experiments. The robot arm is driven by pneumatic artificial muscles, which makes the control and learning tasks challenging. We highlight the potential of our framework by comparing it to a reinforcement learning approach that optimizes over feedback policies. We find that our framework achieves a higher success rate for the returns (100% vs. 96% on 107 consecutive trials, see https://youtu.be/kR9jowEH7PY) while requiring only about one tenth of the samples during training. We also find that our approach is able to deal with a variety of different incoming trajectories.
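The label-generation idea sketched in the abstract can be illustrated with a minimal toy example of model-based iterative learning control (ILC): a feedforward input sequence is refined over repeated trials of the same task, using the previous trial's tracking error and an approximate model of the dynamics as prior knowledge. The scalar plant, the model mismatch, and the reference trajectory below are invented for illustration and are not the paper's actual robot dynamics.

```python
import numpy as np

def lifted_matrix(a, b, T):
    """Lifted (impulse-response) matrix of the scalar system
    x[t+1] = a*x[t] + b*u[t] with output y[t] = x[t+1], so y = G @ u."""
    G = np.zeros((T, T))
    for t in range(T):
        for s in range(t + 1):
            G[t, s] = b * a ** (t - s)
    return G

T = 50
G_true = lifted_matrix(a=0.90, b=0.50, T=T)   # the "real" (unknown) plant
G_model = lifted_matrix(a=0.85, b=0.55, T=T)  # imperfect prior model

ref = np.sin(np.linspace(0.0, np.pi, T))      # desired trajectory (illustrative)
u = np.zeros(T)                               # feedforward input to be learned

errors = []
for trial in range(15):
    y = G_true @ u                            # one rollout on the real system
    e = ref - y                               # tracking error of this trial
    errors.append(np.linalg.norm(e))
    u = u + np.linalg.solve(G_model, e)       # model-based ILC update

print(f"error: first trial {errors[0]:.3f}, last trial {errors[-1]:.2e}")
```

Because the update inverts only an approximate model, each trial contracts the error rather than cancelling it outright; the residual mismatch is what the repeated real-system rollouts correct, which is what makes the converged inputs suitable as supervised-learning labels.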
Pages: 1387-1403
Number of pages: 16