Natural policy gradient reinforcement learning for a CPG control of a biped robot

Cited by: 0
Authors
Nakamura, Y [1]
Mori, T [1]
Ishii, S [1]
Affiliations
[1] Nara Inst Sci & Technol, Nara 63001, Japan
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Motivated by the view that animals' rhythmic movements, such as locomotion, are controlled by neural circuits called central pattern generators (CPGs), researchers have studied CPG-based motor control mechanisms. As an autonomous learning framework for a CPG controller, we previously proposed a reinforcement learning (RL) method called the CPG-actor-critic method. In this article, we propose a natural policy gradient learning algorithm for the CPG-actor-critic method and apply it to an automatic control problem using a biped robot simulator. Computer simulations show that our RL method enables the biped robot to walk stably on various terrains.
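The abstract names a natural policy gradient but the record does not reproduce the algorithm. As a rough illustration only, the sketch below shows the general idea on a linear-Gaussian policy: the ordinary policy gradient g is preconditioned by the inverse of an empirical Fisher information matrix F, giving F^{-1} g. This is a generic textbook form, not the CPG-actor-critic method of the paper; the function names, the mean-return baseline, and the `ridge` regularizer are all assumptions made for this example.

```python
def score(theta, s, a, sigma):
    """Gradient of log N(a | theta . s, sigma^2) with respect to theta."""
    mean = sum(t * x for t, x in zip(theta, s))
    return [(a - mean) * x / sigma**2 for x in s]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def natural_gradient(theta, samples, sigma, ridge=1e-6):
    """Estimate the natural policy gradient from sampled experience.

    samples: list of (state, action, return) tuples (hypothetical format).
    Returns (natural_gradient, vanilla_gradient, fisher_matrix).
    """
    n, d = len(samples), len(theta)
    # Assumption: subtract the mean return as a simple variance-reducing baseline.
    baseline = sum(r for _, _, r in samples) / n
    g = [0.0] * d
    # Assumption: a small ridge term keeps the Fisher estimate invertible.
    F = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for s, a, r in samples:
        z = score(theta, s, a, sigma)
        for i in range(d):
            g[i] += z[i] * (r - baseline) / n   # vanilla gradient estimate
            for j in range(d):
                F[i][j] += z[i] * z[j] / n      # empirical Fisher matrix
    return solve(F, g), g, F
```

A parameter update would then take a step such as theta + alpha * natural_gradient for some step size alpha. The appeal of this form is that the step is invariant to how the policy is parameterized, which is one common motivation for preferring it over the plain gradient.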
Pages: 972 - 981
Page count: 10
Related papers
50 in total
  • [1] Reinforcement learning for a CPG-driven biped robot
    Mori, T
    Nakamura, Y
    Sato, M
    Ishii, S
    PROCEEDING OF THE NINETEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE SIXTEENTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2004, : 623 - 630
  • [2] Learning CPG-based biped locomotion with a policy gradient method: Application to a humanoid robot
    Endo, Gen
    Morimoto, Jun
    Matsubara, Takamitsu
    Nakanishi, Jun
    Cheng, Gordon
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2008, 27 (02): : 213 - 228
  • [3] Learning sensory feedback to CPG with policy gradient for biped locomotion
    Matsubara, T
    Morimoto, J
    Nakanishi, J
    Sato, MA
    Doya, K
    2005 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), VOLS 1-4, 2005, : 4164 - 4169
  • [4] Learning CPG-based biped locomotion with a policy gradient method
    Matsubara, T. (takam-m@atr.jp); Inst. of Elec. and Elec. Eng. Computer Society, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, United States
  • [5] Learning CPG-based biped locomotion with a policy gradient method
    Matsubara, Takamitsu
    Morimoto, Jun
    Nakanishi, Jun
    Sato, Masa-aki
    Doya, Kenji
    ROBOTICS AND AUTONOMOUS SYSTEMS, 2006, 54 (11) : 911 - 920
  • [6] Learning CPG-based biped locomotion with a policy gradient method
    Matsubara, T
    Morimoto, J
    Nakanishi, J
    Sato, M
    Doya, K
    2005 5TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS, 2005, : 208 - 213
  • [7] Reinforcement learning for a biped robot based on a CPG-actor-critic method
    Nakamura, Yutaka
    Mori, Takeshi
    Sato, Masa-Aki
    Ishii, Shin
    NEURAL NETWORKS, 2007, 20 (06) : 723 - 735
  • [8] Off-Policy Natural Policy Gradient Method for a Biped Walking Using a CPG Controller
    Nakamura, Yutaka
    Mori, Takeshi
    Tokita, Yoichi
    Shibata, Tomohiro
    Ishii, Shin
    JOURNAL OF ROBOTICS AND MECHATRONICS, 2005, 17 (06) : 636 - 644
  • [9] Deep reinforcement learning method for biped robot gait control
    Feng C.
    Zhang Y.
    Huang C.
    Jiang W.
    Wu Z.
    1600, CIMS (27): : 2341 - 2349
  • [10] Reinforcement learning control for biped robot walking on uneven surfaces
    Wang, Shouyi
    Braaksma, Jelmer
    Babuska, Robert
    Hobbelen, Daan
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 4173 - +