Trial and Error Using Previous Experiences as Simulation Models in Humanoid Motor Learning

Cited by: 4
Authors
Sugimoto, Norikazu [1 ]
Tangkaratt, Voot [2 ]
Wensveen, Thijs [3 ]
Zhao, Tingting [4 ]
Sugiyama, Masashi [2 ]
Morimoto, Jun [5 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Osaka, Japan
[2] Univ Tokyo, Tokyo 1138654, Japan
[3] Delft Univ Technol, NL-2600 AA Delft, Netherlands
[4] Tianjin Univ Sci & Technol, Tianjin, Peoples R China
[5] ATR Computat Neurosci Labs, Kyoto, Japan
Keywords
POLICY GRADIENTS; SAMPLE REUSE; ROBOTICS
DOI
10.1109/MRA.2015.2511681
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Because biological systems can efficiently reuse previous experiences to change their behavioral strategies, for example, to avoid predators or to find food, they need far fewer samples from the real environment to improve a behavioral policy. For real robotic systems it is likewise desirable to use only a limited number of real-environment samples, both because hardware durability is limited and to reduce the time required to improve control performance. In this article, we use previous experiences as local models of the environment so that the movement policy of a humanoid robot can be efficiently improved with a limited number of samples from its real environment. We applied the proposed learning method to a real humanoid robot and successfully achieved two challenging control tasks. First, the robot acquired a policy for a cart-pole swing-up task in a real-virtual hybrid environment, in which it waves a PlayStation (PS) Move motion controller to drive a cart-pole in a virtual simulator. Second, we applied the method to a challenging basketball-shooting task in a real environment. © 1994-2011 IEEE.
Pages: 96-105
Number of pages: 10
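
The abstract's central idea, reusing stored real-robot transitions as local simulation models so that policy-gradient updates need few new real-environment samples, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the nearest-neighbor dynamics model, the linear-Gaussian policy, and the plain REINFORCE update (LocalModel, reinforce_on_model, W, sigma, reward_fn) are all assumptions introduced here for exposition.

import numpy as np


class LocalModel:
    """Nearest-neighbor dynamics model built from stored (s, a, s') tuples.

    Hypothetical stand-in for 'previous experiences as environmental
    local models'; the real method may use a different local regressor.
    """

    def __init__(self, states, actions, next_states):
        self.keys = np.hstack([states, actions])  # (N, ds + da) query keys
        self.next_states = next_states            # (N, ds) stored outcomes

    def predict(self, s, a):
        # Return the next state of the closest stored (state, action) pair.
        q = np.concatenate([s, a])
        idx = np.argmin(np.linalg.norm(self.keys - q, axis=1))
        return self.next_states[idx]


def reinforce_on_model(model, W, sigma, reward_fn, s0, horizon=50,
                       n_rollouts=20, lr=1e-2, seed=0):
    """One REINFORCE update computed entirely on simulated rollouts.

    Assumed policy: a ~ N(W s, sigma^2 I). Returns the updated weights W.
    """
    rng = np.random.default_rng(seed)
    grad = np.zeros_like(W)
    for _ in range(n_rollouts):
        s, glp, ret = s0.copy(), np.zeros_like(W), 0.0
        for _ in range(horizon):
            mean = W @ s
            a = mean + sigma * rng.standard_normal(mean.shape)
            # Gradient of log N(a; W s, sigma^2 I) with respect to W.
            glp += np.outer((a - mean) / sigma**2, s)
            s = model.predict(s, a)   # step through the local model
            ret += reward_fn(s, a)
        grad += ret * glp / n_rollouts
    return W + lr * grad

In this scheme, real samples are collected once to populate the model; subsequent policy improvement runs on simulated rollouts through it, which is what keeps the real-environment sample count, and hence wear on the hardware, low.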