Deterministic Policy Gradient With Integral Compensator for Robust Quadrotor Control

Cited by: 109
Authors
Wang, Yuanda [1, 2]
Sun, Jia [3]
He, Haibo [4]
Sun, Changyin [1, 2]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Southeast Univ, Key Lab Measurement & Control Complex Syst Engn, Minist Educ, Nanjing 210096, Peoples R China
[3] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
[4] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation
Keywords
Reinforcement learning; Rotors; Helicopters; Neural networks; Aerodynamics; Heuristic algorithms; Robustness; Deterministic policy gradient (DPG); neural network; quadrotor; reinforcement learning; REINFORCEMENT; ATTITUDE
DOI
10.1109/TSMC.2018.2884725
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper proposes a robust control strategy for quadrotor helicopters based on deep reinforcement learning. The quadrotor is controlled by a learned neural network that maps system states directly to control commands in an end-to-end manner. The learning algorithm builds on the deterministic policy gradient (DPG) algorithm. Introducing an integral compensator into the actor-critic structure greatly enhances tracking accuracy and robustness. In addition, a two-phase learning protocol comprising an offline and an online learning phase is proposed for practical implementation: an offline policy is first learned on a simplified quadrotor model, and the policy is then optimized online in actual flight. The proposed approach is evaluated in a flight simulator. The results demonstrate that the offline-learned policy is highly robust to model errors and external disturbances, and that online learning can significantly improve control performance.
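The abstract does not say how the integral compensator enters the actor-critic structure. The sketch below illustrates one plausible reading, under the assumption that the accumulated tracking error is appended to the features seen by a deterministic policy. The names `IntegralAugmenter` and `linear_actor`, the anti-windup clamp `limit`, and the linear policy standing in for the neural network are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class IntegralAugmenter:
    """Hypothetical helper: appends tracking error and its integral to the state."""
    dim: int            # number of tracked state components
    dt: float           # control period in seconds
    limit: float = 1.0  # anti-windup clamp on the integral (an assumption)
    integral: List[float] = field(init=False)

    def __post_init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        """Clear the accumulated error, e.g. at the start of an episode."""
        self.integral = [0.0] * self.dim

    def step(self, state: Sequence[float], reference: Sequence[float]) -> List[float]:
        """Return [state, error, clamped integral of error] as one feature vector."""
        error = [r - s for r, s in zip(reference, state)]
        self.integral = [
            max(-self.limit, min(self.limit, i + e * self.dt))
            for i, e in zip(self.integral, error)
        ]
        return list(state) + error + self.integral


def linear_actor(weights: Sequence[Sequence[float]], features: Sequence[float]) -> List[float]:
    """Deterministic policy stand-in: one action per weight row (no exploration noise)."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


# Usage: a single tracked variable with a unit step in the reference.
augmenter = IntegralAugmenter(dim=1, dt=0.1)
features = augmenter.step(state=[0.0], reference=[1.0])
action = linear_actor([[0.0, 2.0, 5.0]], features)
```

With these illustrative weights the policy behaves like a PI law on the tracking error. A learned actor would replace `linear_actor` with a neural network trained by the deterministic policy gradient, while the critic would see the same augmented features.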
Pages: 3713 - 3725
Page count: 13
Related papers
50 in total
  • [31] Collaborative temperature control of deep deterministic policy gradient and fuzzy PID
    Wu M.
    Wang X.-L.
    Jiang Y.-D.
    Zhong L.
    Mo F.-Y.
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2022, 39 (12) : 2358 - 2365
  • [32] Mars Unmanned Aerial Vehicles Control with Deep Deterministic Policy Gradient
    Sun, Dan
    Zheng, Jianhua
    Gao, Dong
    Han, Peng
    Computer Engineering and Applications, 2023, 59 (08) : 288 - 929
  • [33] Robust tracking control of quadrotor via on-policy adaptive dynamic programming
    Dou, Liqian
    Su, Xiaotong
    Zhao, Xinyi
    Zong, Qun
    He, Lei
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2021, 31 (07) : 2509 - 2525
  • [34] Deep Deterministic Policy Gradient for Traffic Signal Control of Single Intersection
    Pang, Hali
    Gao, Weilong
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 5861 - 5866
  • [35] Platoon Control of Automatic Vehicles Based on Deep Deterministic Policy Gradient
    Luo, Xiaoyuan
    Chen, Tian
    Li, Mengjie
    Li, Shaobao
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 6154 - 6159
  • [36] Robust Integral Sliding Mode Controller for Quadrotor Flight
    Yu, Hang
    Wu, Shiqian
    Lv, Qin
    Zhou, Yimin
    Liu, Siyuan
    2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 7352 - 7356
  • [37] Feature selection in deterministic policy gradient
    Li, Luntong
    Li, Dazi
    Song, Tianheng
    JOURNAL OF ENGINEERING-JOE, 2020, 2020 (13) : 403 - 406
  • [38] Deterministic Policy Gradient: Convergence Analysis
    Xiong, Huaqing
    Xu, Tengyu
    Zhao, Lin
    Liang, Yingbin
    Zhang, Wei
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, VOL 180, 2022, 180 : 2159 - 2169
  • [39] Dynamic Inversion Control of Quadrotor with Complementary Fuzzy Logic Compensator
    Rodic, Aleksandar D.
    Stojkovic, Ivan R.
    ELEVENTH SYMPOSIUM ON NEURAL NETWORK APPLICATIONS IN ELECTRICAL ENGINEERING (NEUREL 2012), 2012,
  • [40] Robust Backstepping Controller Design with a Fuzzy Compensator for Autonomous Hovering Quadrotor UAV
    Mohd Ariffanan Mohd Basri
    Iranian Journal of Science and Technology, Transactions of Electrical Engineering, 2018, 42 : 379 - 391