Learning-based tracking control of AUV: Mixed policy improvement and game-based disturbance rejection

Cited by: 0
Authors
Ye, Jun [1]
Gao, Hongbo [2, 3, 4]
Hu, Manjiang [1, 5]
Bian, Yougang [1, 5]
Cui, Qingjia [1, 5]
Qin, Xiaohui [1, 5]
Ding, Rongjun [1, 5]
Affiliations
[1] Hunan Univ, Coll Mech & Vehicle Engn, State Key Lab Adv Design & Mfg Technol Vehicle, Changsha, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Peoples R China
[3] Univ Sci & Technol China, Inst Adv Technol, Hefei, Peoples R China
[4] Nanyang Technol Univ, Singapore, Singapore
[5] Hunan Univ, Wuxi Intelligent Control Res Inst, Wuxi, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
autonomous vehicle; dynamic programming; intelligent control; intelligent robots; learning (artificial intelligence); trajectory control; AUTONOMOUS UNDERWATER VEHICLES; DYNAMIC-PROGRAMMING ALGORITHM; MODEL-PREDICTIVE CONTROL; TIME NONLINEAR-SYSTEMS; ITERATION;
DOI
10.1049/cit2.12372
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A mixed adaptive dynamic programming (ADP) scheme based on zero-sum game theory is developed to address optimal control problems of autonomous underwater vehicle (AUV) systems subject to disturbances and safety constraints. By combining prior dynamic knowledge with actual sampled data, the proposed approach mitigates the performance loss caused by an inaccurate dynamic model and significantly improves the training speed of the ADP algorithm. Initially, the dataset is populated with reference data collected from a nominal model that does not account for modelling bias. In addition, the controlled vehicle interacts with the real environment and continuously adds sampled data to the dataset. To leverage the advantages of both model-based and model-free methods during training, an adaptive tuning factor is introduced. It is computed from the dataset, which contains model-referenced information and conforms to the distribution of the real-world environment, and it balances the influence of the model-based control law and the data-driven policy gradient on the direction of policy improvement. As a result, the proposed approach learns faster than data-driven methods while also achieving better tracking performance than model-based control methods. Moreover, the optimal control problem under disturbances is formulated as a zero-sum game, and an actor-critic-disturbance framework is introduced to approximate the optimal control input, cost function, and disturbance policy, respectively. Furthermore, the convergence of the proposed algorithm, which is based on the value iteration method, is analysed. Finally, an example of AUV path following based on improved line-of-sight guidance is presented to demonstrate the effectiveness of the proposed method.
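The zero-sum game formulation and value iteration mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a scalar linear system x[k+1] = a*x + b*u + d*w with stage cost q*x^2 + r*u^2 - gamma^2*w^2 and a quadratic value function V(x) = p*x^2, for which the value-iteration update has a closed form. All names and numerical values (`zero_sum_value_iteration`, `saddle_point_gains`, `a`, `b`, `d`, `q`, `r`, `gamma`) are hypothetical and chosen only for illustration; the paper itself treats a nonlinear AUV with neural-network actor, critic, and disturbance approximators.

```python
def zero_sum_value_iteration(a, b, d, q, r, gamma, iters=200, tol=1e-12):
    """Value iteration for a scalar linear-quadratic zero-sum game.

    Assumed toy model (not from the paper):
        x[k+1] = a*x + b*u + d*w
        cost   = sum of q*x^2 + r*u^2 - gamma^2 * w^2
    With V_j(x) = p_j * x^2, minimising over u and maximising over w gives
        p_{j+1} = q + a^2 * p_j / (1 + p_j*b^2/r - p_j*d^2/gamma^2).
    """
    p = 0.0  # value iteration starts from V_0 = 0
    for _ in range(iters):
        # The disturbance player's problem must stay concave in w.
        if gamma**2 - p * d**2 <= 0.0:
            raise ValueError("gamma is too small: no saddle point exists")
        denom = 1.0 + p * b**2 / r - p * d**2 / gamma**2
        p_next = q + a**2 * p / denom
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    return p


def saddle_point_gains(p, a, b, d, r, gamma):
    """Linear feedback gains u = k_u * x and w = k_w * x for a given p."""
    denom = 1.0 + p * b**2 / r - p * d**2 / gamma**2
    k_u = -p * b * a / (r * denom)        # control player (minimiser)
    k_w = p * d * a / (gamma**2 * denom)  # disturbance player (maximiser)
    return k_u, k_w


if __name__ == "__main__":
    params = dict(a=0.9, b=0.5, d=0.2, q=1.0, r=1.0, gamma=2.0)
    p_star = zero_sum_value_iteration(**params)
    k_u, k_w = saddle_point_gains(
        p_star, params["a"], params["b"], params["d"], params["r"], params["gamma"]
    )
    print("p* =", p_star, "k_u =", k_u, "k_w =", k_w)
```

In this toy setting the iteration converges to the fixed point of the scalar game Riccati equation, and the control gain counteracts the state while the disturbance gain pushes in the worst-case direction. An actor-critic-disturbance scheme replaces the closed-form update with learned approximations of these three quantities.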
Pages: 19
Related papers
50 in total
  • [1] Reinforcement Learning-based Active Disturbance Rejection Control for Nonlinear Systems with Disturbance
    Kong, Xiangyu
    Xia, Yuanqing
    2023 2ND CONFERENCE ON FULLY ACTUATED SYSTEM THEORY AND APPLICATIONS, CFASTA, 2023, : 799 - 804
  • [2] ADRC-SMC-based disturbance rejection depth-tracking control of underactuated AUV
    Liu, Chuan
    Xiang, Xianbo
    Duan, Yu
    Yang, Lichun
    Yang, Shaolong
    JOURNAL OF FIELD ROBOTICS, 2024, 41 (04) : 1103 - 1115
  • [3] Eye Tracking in Game-based Learning Research and Game Design
    Kiili, Kristian
    Ketamo, Harri
    Kickmeier-Rust, Michael D.
    INTERNATIONAL JOURNAL OF SERIOUS GAMES, 2014, 1 (02) : 51 - 65
  • [4] Game-based distributed optimal formation tracking control of underactuated AUVs based on reinforcement learning
    Wang, Zhengkun
    Zhang, Lijun
    Zhu, Zeyu
    OCEAN ENGINEERING, 2023, 287
  • [5] Deep Reinforcement Learning-Based Wind Disturbance Rejection Control Strategy for UAV
    Ma, Qun
    Wu, Yibo
    Shoukat, Muhammad Usman
    Yan, Yukai
    Wang, Jun
    Yang, Long
    Yan, Fuwu
    Yan, Lirong
    DRONES, 2024, 8 (11)
  • [6] Learning-Based Model Predictive Control for Addressing Model Mismatch in AUV Trajectory Tracking
    Shen, Xuyu
    Jiao, Huifeng
    Sun, Gongwu
    Hu, Xuanyu
    Zhao, Yuchen
    Chu, Zhenzhong
    Chen, Qi
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2024, PT VI, 2025, 15206 : 45 - 56
  • [7] Game-based learning
    Wadia, Reena
    British Dental Journal, 2024, 237 (10) : 793 - 793
  • [8] Trajectory tracking of quadrotor based on disturbance rejection control
    Wu C.
    Su J.-B.
    Su, Jian-Bo (jbsu@sjtu.edu.cn), 2016, South China University of Technology (33): : 1422 - 1430
  • [9] Attitude tracking of aircraft based on disturbance rejection control
    Su, J.-B. (jbsu@sjtu.edu.cn), 1609, South China University of Technology (30):
  • [10] Trajectory Tracking Control for Under-Actuated Hovercraft Using Differential Flatness and Reinforcement Learning-Based Active Disturbance Rejection Control
    Kong, Xiangyu
    Xia, Yuanqing
    Hu, Rui
    Lin, Min
    Sun, Zhongqi
    Dai, Li
    Journal of Systems Science & Complexity, 2022, 35 (02) : 502 - 521