Reinforcement learning for operational space control

Cited by: 5
Authors
Peters, Jan [1 ]
Schaal, Stefan [1 ]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
Keywords
DOI
10.1109/ROBOT.2007.363633
Chinese Library Classification
TP [automation technology, computer technology];
Discipline code
0812 ;
Abstract
While operational space control is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots, e.g., humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined, as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows this inverse learning problem to be addressed in the framework of reinforcement learning. Yet few of the known optimization or reinforcement learning algorithms can be used for online learning control of robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible on a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications to complex high-degree-of-freedom robots.
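The core idea in the abstract — reducing immediate-reward learning to reward-weighted regression via an EM-style update — can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the 1-D toy task, the fixed exponential reward transformation with parameter `beta` (standing in for the paper's adaptive transformation), and all function names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def rwr_update(states, actions, rewards, beta=5.0):
    """One EM-style reward-weighted regression step.

    Transforms rewards into non-negative weights via exp(beta * r)
    (shifted for numerical stability), then fits linear policy
    parameters by weighted least squares.
    """
    w = np.exp(beta * (rewards - rewards.max()))
    sw = np.sqrt(w)[:, None]
    # Weighted least squares: regress actions on states under weights w.
    theta, *_ = np.linalg.lstsq(sw * states, sw * actions, rcond=None)
    return theta

# Toy task: the "correct" inverse mapping is a = 2*s; reward penalizes
# squared error, so samples closer to the target get larger weights.
theta = np.zeros((1, 1))
for _ in range(20):
    s = rng.uniform(-1.0, 1.0, size=(50, 1))
    a = s @ theta + 0.3 * rng.standard_normal((50, 1))  # exploratory Gaussian policy
    r = -((a - 2.0 * s) ** 2).ravel()
    theta = rwr_update(s, a, r)

print(theta)  # approaches 2.0
```

Because every update is a weighted regression onto actions the policy itself generated, the parameters move smoothly through solution space rather than jumping to untried commands — the property the abstract highlights as essential for physical robots.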
Pages: 2111 / +
Page count: 2
Related papers (50 total)
  • [1] Single Leg Operational Space Control of Quadruped Robot based on Reinforcement Learning
    Rao, Jinhui
    An, Honglei
    Zhang, Taihui
    Chen, Yangzhen
    Ma, Hongxu
    [J]. 2016 IEEE CHINESE GUIDANCE, NAVIGATION AND CONTROL CONFERENCE (CGNCC), 2016, : 597 - 602
  • [2] Learning to control in operational space
    Peters, Jan
    Schaal, Stefan
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2008, 27 (02): : 197 - 212
  • [3] Partitioning input space for reinforcement learning for control
    Hougen, DF
    Gini, M
    Slagle, J
    [J]. 1997 IEEE INTERNATIONAL CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, 1997, : 755 - 760
  • [4] Joint Space Control via Deep Reinforcement Learning
    Kumar, Visak
    Hoeller, David
    Sundaralingam, Balakumar
    Tremblay, Jonathan
    Birchfield, Stan
    [J]. 2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 3619 - 3626
  • [5] Operational Safe Control for Reinforcement-Learning-Based Robot Autonomy
    Zhou, Xu
    [J]. 2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 4091 - 4095
  • [6] Reinforcement Learning-Based Tracking Control of USVs in Varying Operational Conditions
    Martinsen, Andreas B.
    Lekkas, Anastasios M.
    Gros, Sebastien
    Glomsrud, Jon Arne
    Pedersen, Tom Arne
    [J]. FRONTIERS IN ROBOTICS AND AI, 2020, 7
  • [7] Reinforcement Learning for Traffic Signal Control in Hybrid Action Space
    Luo, Haoqing
    Bie, Yiming
    Jin, Sheng
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (06) : 5225 - 5241
  • [8] Reinforcement Learning in Continuous Time and Space: A Stochastic Control Approach
    Wang, Haoran
    Zariphopoulou, Thaleia
    Zhou, Xun Yu
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2020, 21
  • [9] Experiments of conditioned reinforcement learning in continuous space control tasks
    Fernandez-Gauna, Borja
    Osa, Juan Luis
    Grana, Manuel
    [J]. NEUROCOMPUTING, 2018, 271 : 38 - 47