Reinforcement learning and optimal adaptive control: An overview and implementation examples

Cited by: 146
Authors
Khan, Said G. [1]
Herrmann, Guido [2, 3]
Lewis, Frank L. [4]
Pipe, Tony [1]
Melhuish, Chris [5]
Affiliations
[1] Univ W England, Bristol Robot Lab, Bristol BS16 1QY, Avon, England
[2] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England
[3] Univ Bristol, Dept Mech Engn, Bristol, Avon, England
[4] Univ Texas Arlington, Automat & Robot Res Inst, Arlington, TX USA
[5] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England
Funding
US National Science Foundation
Keywords
Reinforcement learning; ADP; Q-learning; Optimal adaptive control; SYSTEMS
DOI
10.1016/j.arcontrol.2012.03.004
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper provides an overview of the reinforcement learning and optimal adaptive control literature and its application to robotics. Reinforcement learning bridges the gap between traditional optimal control, adaptive control and bio-inspired learning techniques borrowed from animals. This work highlights some of the key techniques presented by well-known researchers from the combined areas of reinforcement learning and optimal control theory. Finally, an example implementation of a novel model-free, Q-learning-based discrete optimal adaptive controller for a humanoid robot arm is presented. The controller uses a novel adaptive dynamic programming (ADP) reinforcement learning (RL) approach to develop an optimal policy on-line. The RL joint-space tracking controller was implemented for two links (shoulder flexion and elbow flexion joints) of the arm of the humanoid Bristol-Elumotion-Robotic-Torso II (BERT II) torso. The constrained case (joint limits) of the RL scheme was tested for a single link (elbow flexion) of the BERT II arm by modifying the cost function to deal with the extra nonlinearity due to the joint constraints. © 2012 Elsevier Ltd. All rights reserved.
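As a rough illustration of the model-free, cost-minimising Q-learning idea the abstract mentions, the sketch below learns a tracking policy on a toy discretised error grid. It is not the controller from the paper: the error grid, the `step` dynamics, the quadratic `cost`, the learning parameters and all function names are assumptions made up for this example, and the saturation at the grid edge is only a crude stand-in for a joint limit, not the paper's modified cost function.

```python
import random

# Hypothetical toy problem: none of these values come from the paper.
ERRORS = [-2, -1, 0, 1, 2]   # discretised tracking error
ACTIONS = [-1, 0, 1]         # discretised control increments


def step(error, action):
    """Toy dynamics: the action shifts the error, saturating at the grid edge."""
    return max(ERRORS[0], min(ERRORS[-1], error + action))


def cost(error, action):
    """Quadratic stage cost on tracking error and control effort."""
    return error ** 2 + 0.1 * action ** 2


def train(steps=5000, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning that minimises discounted cost, without using a model."""
    rng = random.Random(seed)
    q = {(e, a): 0.0 for e in ERRORS for a in ACTIONS}
    error = rng.choice(ERRORS)
    for _ in range(steps):
        action = rng.choice(ACTIONS)  # purely exploratory behaviour policy
        nxt = step(error, action)
        # Bellman target uses the minimum over next actions because we minimise cost.
        target = cost(error, action) + gamma * min(q[(nxt, a)] for a in ACTIONS)
        q[(error, action)] += alpha * (target - q[(error, action)])
        error = nxt
    return q


def greedy_policy(q):
    """Pick the lowest-cost action in each state."""
    return {e: min(ACTIONS, key=lambda a: q[(e, a)]) for e in ERRORS}


q_table = train()
policy = greedy_policy(q_table)
print(policy)
```

The learner only ever sees sampled transitions and costs, which is what makes the update model-free; the learned greedy policy pushes the error toward zero and applies no control once it is there.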
Pages: 42-59
Page count: 18