Generalized Model Learning for Reinforcement Learning on a Humanoid Robot

Cited by: 42
Authors
Hester, Todd [1]
Quinlan, Michael [1]
Stone, Peter [1]
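Affiliation
[1] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
DOI
10.1109/ROBOT.2010.5509181
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract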
Reinforcement learning (RL) algorithms have long been promising methods for enabling an autonomous robot to improve its behavior on sequential decision-making tasks. The obvious enticement is that the robot should be able to improve its own behavior without the need for detailed step-by-step programming. However, for RL to reach its full potential, the algorithms must be sample efficient: they must learn competent behavior from very few real-world trials. From this perspective, model-based methods, which use experiential data more efficiently than model-free approaches, are appealing. But they often require exhaustive exploration to learn an accurate model of the domain. In this paper, we present an algorithm, Reinforcement Learning with Decision Trees (RL-DT), that uses decision trees to learn the model by generalizing the relative effect of actions across states. The agent explores the environment until it believes it has a reasonable policy. The combination of the learning approach with the targeted exploration policy enables fast learning of the model. We compare RL-DT against standard model-free and model-based learning methods, and demonstrate its effectiveness on an Aldebaran Nao humanoid robot scoring goals in a penalty kick scenario.
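
The idea in the abstract of learning a model from the relative effect of actions can be illustrated with a small sketch. This is not the authors' implementation: the 10-cell corridor, the experience tuples, the Gini-based tree, and the names `build_tree`, `predict`, and `predict_next` are all assumptions made for illustration. The tree is trained to predict the change in position (next position minus current position) rather than the absolute next position, so experience from a few visited cells can carry over to cells the agent has never visited.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows, labels):
    """Build a small decision tree; leaves are ('leaf', label)."""
    if len(set(labels)) == 1:
        return ("leaf", labels[0])
    best = None
    for f in range(len(rows[0])):
        vals = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between observed values.
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right])) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:  # identical inputs, conflicting labels: majority vote
        return ("leaf", Counter(labels).most_common(1)[0][0])
    _, f, t, left, right = best
    return ("split", f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left]),
            build_tree([rows[i] for i in right], [labels[i] for i in right]))

def predict(tree, row):
    """Follow splits down to a leaf and return its label."""
    while tree[0] == "split":
        _, f, t, lo, hi = tree
        tree = lo if row[f] <= t else hi
    return tree[1]

# Hypothetical experience (position, action, next_position) in a corridor
# with cells 0..9; action 0 = left, 1 = right; walls block at both ends.
# Only cells 0, 1, 2, 8, and 9 were visited.
experience = [(0, 0, 0), (0, 1, 1), (1, 0, 0), (1, 1, 2), (2, 0, 1),
              (2, 1, 3), (8, 0, 7), (8, 1, 9), (9, 0, 8), (9, 1, 9)]

rows = [(p, a) for p, a, _ in experience]
deltas = [nxt - p for p, _, nxt in experience]  # relative effect of the action
tree = build_tree(rows, deltas)

def predict_next(tree, pos, action):
    """Predicted next position = current position + predicted change."""
    return pos + predict(tree, (pos, action))

print(predict_next(tree, 5, 1))  # cell 5 was never visited
print(predict_next(tree, 5, 0))
```

A tabular model would have no entry for cell 5, whereas the tree generalizes from the visited cells because the learned quantity (the change in position) is the same across most of the corridor.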
Pages: 2369-2374
Page count: 6