Proximal policy optimization with an integral compensator for quadrotor control

Cited by: 0
Authors
Huan Hu
Qing-ling Wang
Affiliations
[1] Southeast University, School of Automation
Keywords
Reinforcement learning; Proximal policy optimization; Quadrotor control; Neural network
DOI
Not available
CLC number
TP183; TP273
Discipline classification code
Abstract
We use the proximal policy optimization (PPO) reinforcement learning algorithm to optimize a stochastic control policy for speed control of a “model-free” quadrotor. The model is controlled by four learned neural networks, which map the system states directly to control commands in an end-to-end manner. Introducing an integral compensator into the actor-critic framework greatly improves speed tracking accuracy and robustness. In addition, a two-phase learning scheme comprising offline and online learning is developed for practical use. A model with strong generalization ability is learned in the offline phase; the flight policy of the model is then continuously optimized in the online phase. Finally, the performance of the proposed algorithm is compared with that of the traditional PID algorithm.
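The abstract does not say how the integral compensator enters the actor-critic framework. The sketch below is only an illustration under assumptions: it appends the running integral of the speed tracking error to the state vector seen by the networks, and it evaluates the standard PPO clipped surrogate objective. The function names, the Euler integration step, and the clipping value eps=0.2 are hypothetical choices, not details taken from the paper.

```python
import numpy as np

def augment_with_integral(state, v_ref, v, integral, dt):
    """Append the running integral of the speed tracking error to the state.

    Assumption: the compensator is modeled as extra state inputs; the
    paper's actual design may differ.
    """
    error = v_ref - v
    integral = integral + error * dt  # simple Euler integration of the error
    return np.concatenate([state, integral]), integral

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized), batch mean.

    ratio is pi_new(a|s) / pi_old(a|s); eps=0.2 is a common default,
    not a value reported in the abstract.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return np.mean(np.minimum(unclipped, clipped))
```

For example, with a three-axis speed reference, `augment_with_integral(np.zeros(3), v_ref, v, np.zeros(3), dt=0.01)` returns a six-element augmented state together with the updated integral, which would be carried over to the next control step.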
Pages: 777-795
Page count: 18
Related papers
50 in total
  • [1] Proximal policy optimization with an integral compensator for quadrotor control
    Hu, Huan
    Wang, Qing-ling
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2020, 21 (05): 777-795
  • [2] Deterministic Policy Gradient With Integral Compensator for Robust Quadrotor Control
    Wang, Yuanda
    Sun, Jia
    He, Haibo
    Sun, Changyin
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (10): 3713-3725
  • [3] Deep Deterministic Policy Gradient with Generalized Integral Compensator for Height Control of Quadrotor
    Liu, Anlin
    Liu, Lei
    Cao, Jinde
    Alsaadi, Fawaz E.
    JOURNAL OF APPLIED ANALYSIS AND COMPUTATION, 2022, 12 (03): 868-894
  • [4] Intelligent Control of a Quadrotor with Proximal Policy Optimization Reinforcement Learning
    Lopes, Guilherme Cano
    Ferreira, Murillo
    Simoes, Alexandre da Silva
    Colombini, Esther Luna
    15TH LATIN AMERICAN ROBOTICS SYMPOSIUM 6TH BRAZILIAN ROBOTICS SYMPOSIUM 9TH WORKSHOP ON ROBOTICS IN EDUCATION (LARS/SBR/WRE 2018), 2018: 503-508
  • [5] An Improved Proximal Policy Optimization Method for Low-Level Control of a Quadrotor
    Xue, Wentao
    Wu, Hangxing
    Ye, Hui
    Shao, Shuyi
    ACTUATORS, 2022, 11 (04)
  • [6] Layered learning in a quadrotor drone: Simultaneous controlling and path planning using optimal fuzzy fractional order proportional integral derivative and proximal policy optimization
    Shahbazi, Hamed
    Tikani, Vahid
    Fatahi, Roholamin
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 136
  • [7] Dynamic Inversion Control of Quadrotor with Complementary Fuzzy Logic Compensator
    Rodic, Aleksandar D.
    Stojkovic, Ivan R.
    ELEVENTH SYMPOSIUM ON NEURAL NETWORK APPLICATIONS IN ELECTRICAL ENGINEERING (NEUREL 2012), 2012
  • [8] Mixed-Autonomy Traffic Control with Proximal Policy Optimization
    Wei, Haoran
    Liu, Xuanzhang
    Mashayekhy, Lena
    Decker, Keith
    2019 IEEE VEHICULAR NETWORKING CONFERENCE (VNC), 2019
  • [9] A Proximal Policy Optimization method in UAV swarm formation control
    Yu, Ning
    Juan, Feng
    Zhao, Hongwei
    ALEXANDRIA ENGINEERING JOURNAL, 2024, 100: 268-276
  • [10] Integral Backstepping Based Nonlinear Control for Quadrotor
    Tian Congling
    Wang Jingwen
    Yin Zhaojie
    Yu Guohui
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016: 10581-10585