Model-Free Reinforcement Learning of Impedance Control in Stochastic Environments

Cited by: 38
Authors
Stulp, Freek [1 ,2 ,3 ]
Buchli, Jonas [1 ,4 ]
Ellmer, Alice [1 ]
Mistry, Michael [5 ]
Theodorou, Evangelos A. [1 ]
Schaal, Stefan [1 ]
Affiliations
[1] Univ So Calif, Computat Learning & Motor Control Lab, Los Angeles, CA 90089 USA
[2] ParisTech, Ecole Natl Superi Tech Avancees, F-75015 Paris, France
[3] INRIA Bordeaux Sud Ouest, FLOWERS Res Team, F-33405 Talence, France
[4] Ist Italiano Tecnol, Dept Adv Robot, I-16163 Genoa, Italy
[5] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
Funding
US National Science Foundation; Swiss National Science Foundation
Keywords
Force field experiments; motion primitives; motor system and development; reinforcement learning; robots with development and learning skills; using robots to study development and learning; variable impedance control;
DOI
10.1109/TAMD.2012.2205924
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
For humans and robots, variable impedance control is an essential component for ensuring robust and safe physical interaction with the environment. Humans learn to adapt their impedance to specific tasks and environments, a capability that we continue to develop and refine until well into our twenties. In this article, we reproduce functionally interesting aspects of learning impedance control in humans on a simulated robot platform. As demonstrated in numerous force field tasks, humans combine two strategies to adapt their impedance to perturbations, thereby minimizing position error and energy consumption: 1) if perturbations are unpredictable, subjects increase their impedance through cocontraction; and 2) if perturbations are predictable, subjects learn a feed-forward command to offset the perturbation. We show that a simulated 7-DOF robot exhibits similar behavior with our model-free reinforcement learning algorithm PI2 when deterministic and stochastic force fields are applied to its end-effector, and we demonstrate the qualitative similarity between the robot and human movements. Our results provide a biologically plausible approach to learning appropriate impedances purely from experience, without requiring a model of either body or environment dynamics. Not requiring models also facilitates autonomous development for robots, as prespecified models cannot be provided for each environment a robot might encounter.
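To illustrate the kind of model-free, cost-weighted update that PI2 performs, the Python sketch below shows a simplified episodic (PI^BB-style) parameter update: policy parameters (here read as joint stiffness gains) are perturbed with Gaussian exploration noise, each perturbed rollout is scored by a cost trading off position error against a cocontraction/energy penalty, and the parameters are updated by exponentially cost-weighted averaging. The cost function, the stiffness interpretation, and all parameter names are illustrative assumptions, not the paper's actual implementation.

import numpy as np

def pi2_update(theta, rollout_cost, n_rollouts=20, sigma=0.05, h=10.0, n_iters=100):
    # Simplified PI2-style episodic update: sample parameter perturbations,
    # evaluate one rollout per sample, and average the perturbations with
    # weights exp(-h * normalized_cost), so low-cost rollouts dominate.
    for _ in range(n_iters):
        eps = sigma * np.random.randn(n_rollouts, theta.size)            # Gaussian exploration noise
        costs = np.array([rollout_cost(theta + e) for e in eps])         # cost of each noisy rollout
        c = (costs - costs.min()) / (costs.max() - costs.min() + 1e-10)  # normalize costs to [0, 1]
        w = np.exp(-h * c)                                               # softmax-style weights
        w /= w.sum()
        theta = theta + w @ eps                                          # probability-weighted update
    return theta

def rollout_cost(theta):
    # Toy cost (an assumption, not the paper's cost): stiffer joints reduce
    # tracking error under perturbation, but cocontraction costs energy.
    k_p = np.abs(theta)                              # interpret parameters as stiffness gains
    pos_error = np.sum((1.0 / (1.0 + k_p)) ** 2)     # error shrinks as stiffness grows
    effort = 1e-3 * np.sum(k_p ** 2)                 # energy / cocontraction penalty
    return pos_error + effort

theta0 = np.ones(7)                                  # e.g., stiffness parameters of a 7-DOF arm
print(pi2_update(theta0, rollout_cost))

In the paper itself the learned parameters shape time-varying gain schedules and feed-forward terms of motion primitives; the static gain vector above is used only to keep the sketch self-contained and runnable.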
Pages: 330-341
Page count: 12