Reinforcement learning and optimal adaptive control: An overview and implementation examples

Cited by: 156

Authors:

Khan, Said G. [1]

Herrmann, Guido [2,3]

Lewis, Frank L. [4]

Pipe, Tony [1]

Melhuish, Chris [5]

Affiliations:

[1] Univ W England, Bristol Robot Lab, Bristol BS16 1QY, Avon, England

[2] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England

[3] Univ Bristol, Dept Mech Engn, Bristol, Avon, England

[4] Univ Texas Arlington, Automat & Robot Res Inst, Arlington, TX USA

[5] Univ Bristol, Bristol Robot Lab, Bristol, Avon, England

Source:

ANNUAL REVIEWS IN CONTROL | 2012 / Vol. 36 / Issue 01

Funding:

US National Science Foundation;

Keywords:

Reinforcement learning; ADP; Q-learning; Optimal adaptive control; SYSTEMS;

DOI:

10.1016/j.arcontrol.2012.03.004

Chinese Library Classification (CLC):

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper provides an overview of the reinforcement learning and optimal adaptive control literature and its application to robotics. Reinforcement learning bridges the gap between traditional optimal control, adaptive control and bio-inspired learning techniques borrowed from animals. This work highlights some of the key techniques presented by well-known researchers from the combined areas of reinforcement learning and optimal control theory. At the end, an example of an implementation of a novel model-free Q-learning based discrete optimal adaptive controller for a humanoid robot arm is presented. The controller uses a novel adaptive dynamic programming (ADP) reinforcement learning (RL) approach to develop an optimal policy on-line. The RL joint space tracking controller was implemented for two links (shoulder flexion and elbow flexion joints) of the arm of the humanoid Bristol-Elumotion-Robotic-Torso II (BERT II) torso. The constrained case (joint limits) of the RL scheme was tested for a single link (elbow flexion) of the BERT II arm by modifying the cost function to deal with the extra nonlinearity due to the joint constraints. (C) 2012 Elsevier Ltd. All rights reserved.


Pages: 42-59

Page count: 18

