An Actor Critic Method for Free Terminal Time Optimal Control

Cited by: 0
Authors
Burton, Evan [1]
Nakamura-Zimmerer, Tenavi [1, 2]
Gong, Qi [1]
Kang, Wei [1, 3]
Affiliations
[1] Univ Calif Santa Cruz, Santa Cruz, CA 95060 USA
[2] NASA Langley Res Ctr, Flight Dynam Branch, Hampton, VA 23666 USA
[3] Naval Postgrad Sch, Monterey, CA 93943 USA
Source
IFAC PAPERSONLINE | 2023 / Vol. 56 / Issue 01
Funding
National Science Foundation (USA);
Keywords
Iterative learning control; Non-smooth and discontinuous optimal control problems;
DOI
10.1016/j.ifacol.2023.02.009
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Optimal control problems with free terminal time present many challenges, including nonsmooth and discontinuous control laws, irregular value functions, many local optima, and the curse of dimensionality. To overcome these issues, we propose an adaptation of the model-based actor-critic paradigm from the field of Reinforcement Learning via an exponential transformation to learn an approximate feedback control and value function pair. We demonstrate the algorithm's effectiveness on prototypical examples featuring each of the main pathological issues present in problems of this type. Copyright (c) 2023 The Authors. This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 49-54
Number of pages: 6
Related papers
50 in total
  • [21] Model-free optimal containment control of multi-agent systems based on actor-critic framework
    Wang, Wei
    Chen, Xin
    NEUROCOMPUTING, 2018, 314 : 242 - 250
  • [22] A Variational Approach to Perturbation Feedback Control for Optimal Control Problems with Terminal Constraints and Free Terminal Time
    Sankalp Bhan
    Heinz Schättler
    Set-Valued and Variational Analysis, 2019, 27 : 309 - 330
  • [23] A Variational Approach to Perturbation Feedback Control for Optimal Control Problems with Terminal Constraints and Free Terminal Time
    Bhan, Sankalp
    Schattler, Heinz
    SET-VALUED AND VARIATIONAL ANALYSIS, 2019, 27 (02) : 309 - 330
  • [24] Multiple Actor-Critic Structures for Continuous-Time Optimal Control Using Input-Output Data
    Song, Ruizhuo
    Lewis, Frank
    Wei, Qinglai
    Zhang, Hua-Guang
    Jiang, Zhong-Ping
    Levine, Dan
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (04) : 851 - 865
  • [25] Optimal Elevator Group Control via Deep Asynchronous Actor-Critic Learning
    Wei, Qinglai
    Wang, Lingxiao
    Liu, Yu
    Polycarpou, Marios M.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (12) : 5245 - 5256
  • [26] HVAC Optimal Control with the Multistep-Actor Critic Algorithm in Large Action Spaces
    Huang, Zetian
    Chen, Jianping
    Fu, Qiming
    Wu, Hongjie
    Lu, You
    Gao, Zhen
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020
  • [27] Generalized Actor-Critic Learning Optimal Control in Smart Home Energy Management
    Wei, Qinglai
    Liao, Zehua
    Shi, Guang
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2021, 17 (10) : 6614 - 6623
  • [28] Traffic signal control method based on asynchronous advantage actor-critic
    Ye, Baolin
    Sun, Ruitao
    Wu, Weimin
    Chen, Bin
    Yao, Qing
    Zhejiang Daxue Xuebao (Gongxue Ban)/Journal of Zhejiang University (Engineering Science), 2024, 58 (08) : 1671 - 1680
  • [29] NUMERICAL METHOD FOR SOLVING OPTIMAL CONTROL PROBLEMS WITH UNSPECIFIED TERMINAL TIME
    QUINTANA, VH
    DAVISON, EJ
    INTERNATIONAL JOURNAL OF CONTROL, 1973, 17 (01) : 97 - 115
  • [30] Actor-Critic-Based Predefined-Time Fuzzy Adaptive Optimal Control for Uncertain Nonlinear Systems With Input Saturation
    Yang, Wei
    Wang, Qing-Guo
    Liu, Jiapeng
    Yu, Jinpeng
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2024, 32 (04) : 2448 - 2457