Advanced Policy Learning Near-Optimal Regulation

Cited by: 0
Authors
Ding Wang [1, 2]
Xiangnan Zhong [1, 3]
Affiliations
[1] IEEE
[2] the Faculty of Information Technology, Beijing University of Technology, and also with the Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology
[3] the Department of Electrical Engineering, University of North Texas
Funding
National Natural Science Foundation of China
Keywords
Adaptive critic algorithm; learning control; neural approximation; nonaffine dynamics; optimal regulation;
DOI
Not available
Chinese Library Classification number
O232 [Optimal control]
Discipline classification code
Abstract
Developing advanced design techniques for feedback stabilization and optimization of complex systems is important to the modern control field. In this paper, a near-optimal regulation method for general nonaffine dynamics is developed with the help of policy learning. To address the nonaffine nonlinearity, a pre-compensator is constructed so that the augmented system can be formulated in an affine-like form. Different cost functions are defined for the original and transformed controlled plants, and their relationship is analyzed in detail. Additionally, an adaptive critic algorithm with a stability guarantee is employed to solve the augmented optimal control problem. Finally, several case studies on a torsional pendulum plant with a suitable cost are conducted to verify stability, robustness, and optimality.
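The pre-compensator idea in the abstract can be illustrated with a small sketch. The abstract does not state the form of the pre-compensator, so the code below assumes a plain integrator (the new input v is the rate of change of the original control u), which is one common way to obtain an affine-like augmented system. The plant `f_nonaffine`, its coefficients, and the function names are hypothetical and are not taken from the paper.

```python
import math


def f_nonaffine(x, u):
    """Hypothetical scalar nonaffine plant: u enters through tanh and u**3."""
    return -x + math.tanh(u) + 0.1 * u ** 3


def augmented_dynamics(z, v):
    """Augmented system z_dot = F(z) + G * v with state z = (x, u).

    Assumes an integrator pre-compensator u_dot = v, so the new input v
    enters the augmented dynamics linearly even though u does not.
    """
    x, u = z
    F = (f_nonaffine(x, u), 0.0)  # drift: plant dynamics; control state has no drift
    G = (0.0, 1.0)                # new input v drives only the control state
    return (F[0] + G[0] * v, F[1] + G[1] * v)


if __name__ == "__main__":
    z = (0.5, -0.2)
    print(augmented_dynamics(z, 0.0))
    print(augmented_dynamics(z, 1.0))
```

Under this assumption, the original control u becomes part of the state, and an optimal control problem for the augmented system is posed in terms of v, which is why a separate cost function for the transformed plant is needed.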
Pages: 743-749
Page count: 7