Deterministic Policy Gradient With Integral Compensator for Robust Quadrotor Control

Cited by: 109

Authors:

Wang, Yuanda [1,2]

Sun, Jia [3]

He, Haibo [4]

Sun, Changyin [1,2]

Affiliations:

[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China

[2] Southeast Univ, Key Lab Measurement & Control Complex Syst Engn, Minist Educ, Nanjing 210096, Peoples R China

[3] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 10083, Peoples R China

[4] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA

Source:

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2020 / Vol. 50 / No. 10

Funding:

National Natural Science Foundation of China; U.S. National Science Foundation;

Keywords:

Reinforcement learning; Rotors; Helicopters; Neural networks; Aerodynamics; Heuristic algorithms; Robustness; Deterministic policy gradient (DPG); neural network; quadrotor; reinforcement learning; REINFORCEMENT; ATTITUDE;

DOI:

10.1109/TSMC.2018.2884725

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

In this paper, a deep reinforcement learning-based robust control strategy for quadrotor helicopters is proposed. The quadrotor is controlled by a learned neural network which directly maps the system states to control commands in an end-to-end style. The learning algorithm is developed based on the deterministic policy gradient algorithm. By introducing an integral compensator to the actor-critic structure, the tracking accuracy and robustness have been greatly enhanced. Moreover, a two-phase learning protocol which includes both offline and online learning phases is proposed for practical implementation. An offline policy is first learned based on a simplified quadrotor model. Then, the policy is optimized online in actual flight. The proposed approach is evaluated in a flight simulator. The results demonstrate that the offline learned policy is highly robust to model errors and external disturbances. They also show that online learning can significantly improve the control performance.
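
The abstract does not say how the integral compensator enters the actor-critic structure. One common way to realize the idea, shown here purely as an illustrative sketch and not as the authors' formulation, is to accumulate the tracking error over time and append it to the observation that the policy network sees. The names `IntegralCompensator` and `augmented_observation`, the time step `dt`, and the anti-windup clamp `limit` are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class IntegralCompensator:
    """Hypothetical helper: running integral of the tracking error."""
    dt: float                    # control period in seconds (assumed)
    limit: float = 1.0           # anti-windup clamp (assumed, not from the paper)
    integral: List[float] = field(default_factory=list)

    def update(self, error: Sequence[float]) -> List[float]:
        # (Re)initialise the accumulator when the error dimension changes.
        if len(self.integral) != len(error):
            self.integral = [0.0] * len(error)
        # Rectangular integration, clamped to [-limit, limit] per component.
        self.integral = [
            max(-self.limit, min(self.limit, acc + e * self.dt))
            for acc, e in zip(self.integral, error)
        ]
        return list(self.integral)


def augmented_observation(
    state: Sequence[float],
    target: Sequence[float],
    compensator: IntegralCompensator,
) -> List[float]:
    """Build the policy input: state, tracking error, and integrated error."""
    error = [t - s for s, t in zip(state, target)]
    return list(state) + error + compensator.update(error)
```

Under these assumptions, a policy trained on `augmented_observation(...)` rather than on the raw state can react to a persistent steady-state error, which is the role an integral term plays in a classical PID loop.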

Pages: 3713-3725

Page count: 13
