Reinforcement learning for operational space control

Cited by: 5
Authors
Peters, Jan [1 ]
Schaal, Stefan [1 ]
Affiliations
[1] Univ Southern Calif, Los Angeles, CA 90089 USA
Keywords
DOI
10.1109/ROBOT.2007.363633
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline classification code
0812;
Abstract
While operational space control is of essential importance for robotics and well understood from an analytical point of view, it can be prohibitively hard to achieve accurate control in the face of modeling errors, which are inevitable in complex robots such as humanoid robots. In such cases, learning control methods can offer an interesting alternative to analytical control algorithms. However, the resulting supervised learning problem is ill-defined, as it requires learning an inverse mapping of a usually redundant system, which is well known to suffer from non-convexity of the solution space, i.e., the learning system could generate motor commands that try to steer the robot into physically impossible configurations. The important insight that many operational space control algorithms can be reformulated as optimal control problems, however, allows this inverse learning problem to be addressed in the framework of reinforcement learning. Yet few of the known optimization or reinforcement learning algorithms can be used for online learning control on robots, as they are either prohibitively slow, do not scale to interesting domains of complex robots, or require trying out policies generated by random search, which is infeasible on a physical system. Using a generalization of the EM-based reinforcement learning framework suggested by Dayan & Hinton, we reduce the problem of learning with immediate rewards to a reward-weighted regression problem with an adaptive, integrated reward transformation for faster convergence. The resulting algorithm is efficient, learns smoothly without dangerous jumps in solution space, and works well in applications to complex, high-degree-of-freedom robots.
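The reduction to reward-weighted regression admits a compact illustration. The sketch below is a simplified, hypothetical Python example of the idea described in the abstract (exponentially transformed immediate rewards used as weights in a weighted least-squares fit of a linear-Gaussian policy); it is not the authors' implementation, and the function name rwr_update, the fixed transformation parameter beta, and the toy reward are assumptions for illustration only.

import numpy as np

def rwr_update(states, actions, rewards, beta=1.0, ridge=1e-6):
    """One EM-style update: regress actions on states, weighting each
    sample by an exponentially transformed immediate reward."""
    # Reward transformation; beta stands in for the adaptive scaling
    # mentioned in the abstract (here kept fixed for simplicity).
    w = np.exp(beta * (rewards - rewards.max()))
    W = np.diag(w)
    # Weighted least squares: theta = (S^T W S)^-1 S^T W U
    A = states.T @ W @ states + ridge * np.eye(states.shape[1])
    b = states.T @ W @ actions
    return np.linalg.solve(A, b)

# Toy usage: learn to map a 1-D state s to the action that maximizes
# the immediate reward r = -(u - 2*s)^2, i.e., theta should approach 2.
rng = np.random.default_rng(0)
theta = np.zeros((1, 1))
for _ in range(50):
    s = rng.uniform(-1, 1, size=(100, 1))
    u = s @ theta + 0.3 * rng.standard_normal((100, 1))  # exploratory policy
    r = -((u - 2.0 * s) ** 2).ravel()                    # immediate reward
    theta = rwr_update(s, u, r, beta=5.0)
print(theta)  # close to [[2.0]]

Because every update is a weighted regression toward actions that already received high reward, the policy changes gradually rather than by random search, which is the property the abstract highlights for safe use on physical robots.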
Pages: 2111+
Page count: 2
Related papers
50 records in total
  • [21] Motion control of a space manipulator using fuzzy sliding mode control with reinforcement learning
    Xie, Zhicheng
    Sun, Tao
    Kwan, Trevor
    Wu, Xiaofeng
    [J]. ACTA ASTRONAUTICA, 2020, 176 : 156 - 172
  • [22] Data-Driven Flotation Industrial Process Operational Optimal Control Based on Reinforcement Learning
    Jiang, Yi
    Fan, Jialu
    Chai, Tianyou
    Li, Jinna
    Lewis, Frank L.
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14 (05) : 1974 - 1989
  • [23] Traffic Signal Control Using Hybrid Action Space Deep Reinforcement Learning
    Bouktif, Salah
    Cheniki, Abderraouf
    Ouni, Ali
    [J]. SENSORS, 2021, 21 (07)
  • [24] Large space dimension Reinforcement Learning for Robot Position/Force Discrete Control
    Perrusquia, Adolfo
    Yu, Wen
    Soria, Alberto
    [J]. 2019 6TH INTERNATIONAL CONFERENCE ON CONTROL, DECISION AND INFORMATION TECHNOLOGIES (CODIT 2019), 2019, : 91 - 96
  • [25] Using reward-weighted regression for reinforcement learning of task space control
    Peters, Jan
    Schaal, Stefan
    [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON APPROXIMATE DYNAMIC PROGRAMMING AND REINFORCEMENT LEARNING, 2007, : 262 - +
  • [26] Reinforcement learning in continuous time and space
    Doya, K
    [J]. NEURAL COMPUTATION, 2000, 12 (01) : 219 - 245
  • [27] The operational space control applied to a space robotic manipulator
    Ferretti, G
    Magnani, GA
    Rocco, P
    Viganò, L
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, VOLS 1- 5, PROCEEDINGS, 2004, : 2550 - 2555
  • [28] An Analysis of the Operational Space Control of Robots
    Ngoc Dung Vuong
    Ang, Marcelo H., Jr.
    Lim, Tao Ming
    Lim, Ser Yong
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2010, : 4163 - 4168
  • [29] RECURSIVE FORMULATION OF OPERATIONAL SPACE CONTROL
KREUTZ-DELGADO, K
    JAIN, A
    RODRIGUEZ, G
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 1992, 11 (04): : 320 - 328
  • [30] Operational space control for a Puma robot
    Salinas, Sergio Alexander
    Larrarte, Eliana Aguilar
    Alban, Andres Vivas
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS I-V, CONFERENCE PROCEEDINGS, 2007, : 2183 - 2188