Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning

Cited by: 0
Authors
Westenbroek, Tyler [1, 2]
Mazumdar, Eric [1, 2]
Fridovich-Keil, David [1, 2]
Prabhu, Valmik [1, 2]
Tomlin, Claire J. [1, 2]
Sastry, S. Shankar [1, 2]
Affiliations
[1] Dept Elect Engn, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Berkeley, CA 94720 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a framework for adaptively learning a feedback linearization-based tracking controller for an unknown system using discrete-time, model-free policy-gradient parameter update rules. The primary advantage of the scheme over standard model-reference adaptive control techniques is that it does not require the learned inverse model to be invertible at all instants of time. This enables the use of general function approximators to approximate the linearizing controller for the system without having to worry about singularities. Because the policy-gradient updates are random, the overall learning system is stochastic. We therefore combine analysis techniques commonly employed in the machine learning literature with stability arguments from adaptive control to demonstrate that, with high probability, the tracking and parameter errors concentrate near zero under a standard persistency of excitation condition. A simulated double pendulum example demonstrates the utility of the proposed theory.
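To make the idea in the abstract concrete, the following is a minimal sketch, not the paper's algorithm or its double pendulum example. It assumes a hypothetical scalar plant `x_dot = a*x + b*u` whose coefficients are hidden from the learner, a linear-in-features controller `u = theta . [x, v]`, and a one-step score-function (REINFORCE-style) update that penalizes the mismatch between the observed `x_dot` and the desired linear response `v`. All names and constants (`A_TRUE`, `B_TRUE`, `features`, `train`, the step size and noise level) are illustrative assumptions. States and virtual inputs are sampled uniformly at each step, a crude stand-in for a persistency of excitation condition.

```python
import numpy as np

# Hypothetical plant x_dot = a*x + b*u; the learner never reads these directly.
A_TRUE, B_TRUE = 1.0, 2.0


def features(x, v):
    """Feature vector for the learned controller u = theta . [x, v]."""
    return np.array([x, v])


def linearization_loss(theta, x, v, noise):
    """Squared mismatch between the observed x_dot and the desired x_dot = v."""
    u = theta @ features(x, v) + noise      # exploratory control input
    x_dot = A_TRUE * x + B_TRUE * u         # plant response (black box to the learner)
    return (x_dot - v) ** 2


def train(steps=20000, lr=1e-3, sigma=0.1, seed=0):
    """Model-free policy-gradient updates from one-step rollouts."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    for _ in range(steps):
        # Sample a state and a virtual input so both features stay excited.
        x, v = rng.uniform(-1.0, 1.0, size=2)
        eps = rng.normal()
        loss = linearization_loss(theta, x, v, sigma * eps)
        # Score-function estimate of d E[loss] / d theta for a Gaussian policy.
        grad_est = loss * (eps / sigma) * features(x, v)
        theta -= lr * grad_est
    return theta


if __name__ == "__main__":
    theta = train()
    theta_star = np.array([-A_TRUE / B_TRUE, 1.0 / B_TRUE])  # exact linearizing gains
    print("learned:", theta, "ideal:", theta_star)
```

In this toy setting the exact linearizing controller is `u = (v - a*x) / b`, so the ideal parameters are `[-a/b, 1/b]`. The update never inverts a learned model of `b`; it only evaluates the controller and observes the plant's response, which is the property the abstract highlights.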
Pages: 118-125
Page count: 8