Simulation-based evaluation of model-free reinforcement learning algorithms for quadcopter attitude control and trajectory tracking

Cited: 0
Authors
Yuste, Pablo Caffyn [1 ]
Iglesias Martinez, Jose Antonio [1 ]
Sanchis de Miguel, Maria Araceli [1 ]
Affiliations
[1] Univ Carlos III Madrid, Dept Comp Sci & Engn, Leganes, Spain
Keywords
Reinforcement learning; Continuous control; Model-free; Quadcopter; System
DOI
10.1016/j.neucom.2024.128362
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
General-use quadcopters have been under development for over a decade, yet many of their potential applications remain under evaluation and have not been adopted in areas that could benefit from them. While the current generation of quadcopters relies on a mature set of control algorithms, the next steps, especially as autonomous features are developed, should involve a more sophisticated learning capability that can adapt to unknown circumstances safely and reliably. This paper provides baseline quadcopter control models learnt using eight general reinforcement learning (RL) algorithms in a simulated environment, with the aim of establishing a reference performance, in terms of both precision and training cost, for a simple set of trajectories. Each algorithm uses a tailored set of hyperparameters, and the influence of random seeds is also studied. While not all algorithms converge within the allocated computing budget, the more complex ones are able to produce stable and precise control models. This paper recommends the TD3 algorithm as a reference for comparison with new RL algorithms. Additional guidance for future work is provided based on the weaknesses identified in the learning process, especially the strong dependence of agent performance on random seeds.
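The seed-sensitivity experiment the abstract describes can be illustrated with a short, hedged sketch (not the paper's code): train TD3 under several random seeds and compare the spread of evaluation returns. The paper's quadcopter simulator is not reproduced here; Gymnasium's Pendulum-v1 is assumed as a stand-in continuous-control task, with stable-baselines3 supplying the TD3 implementation.

```python
# Minimal sketch of a seed-sensitivity study with TD3, in the spirit of the
# experiment described in the abstract. NOT the paper's code: Pendulum-v1 is
# an assumed stand-in for the quadcopter simulator, and the timestep budget
# is illustrative.
import gymnasium as gym
import numpy as np
from stable_baselines3 import TD3
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.monitor import Monitor

SEEDS = [0, 1, 2]    # the paper stresses the influence of random seeds
BUDGET = 50_000      # illustrative computing budget (timesteps)

mean_returns = []
for seed in SEEDS:
    env = Monitor(gym.make("Pendulum-v1"))  # stand-in continuous-control task
    model = TD3("MlpPolicy", env, seed=seed, verbose=0)
    model.learn(total_timesteps=BUDGET)
    mean_ret, _ = evaluate_policy(model, env, n_eval_episodes=10)
    mean_returns.append(mean_ret)
    env.close()

# A large gap between the best and worst seed is exactly the seed
# dependence the paper warns about.
print(f"per-seed mean returns: {[round(r, 1) for r in mean_returns]}")
print(f"across seeds: {np.mean(mean_returns):.1f} +/- {np.std(mean_returns):.1f}")
```

A fuller baseline comparison in the paper's spirit would repeat this loop over several algorithms (stable-baselines3 also ships PPO, SAC, and DDPG, for example) with per-algorithm hyperparameters, then compare tracking precision and training cost.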
Pages: 16
Related papers
50 records in total
  • [31] Controlled interacting particle algorithms for simulation-based reinforcement learning
    Joshi, Anant A.
    Taghvaei, Amirhossein
    Mehta, Prashant G.
    Meyn, Sean P.
    SYSTEMS & CONTROL LETTERS, 2022, 170
  • [32] Model-Free Preference-Based Reinforcement Learning
    Wirth, Christian
    Fuernkranz, Johannes
    Neumann, Gerhard
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016 : 2222 - 2228
  • [33] Model-free learning control of neutralization processes using reinforcement learning
    Syafiie, S.
    Tadeo, F.
    Martinez, E.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2007, 20 (06) : 767 - 782
  • [34] Model-free attitude synchronization for multiple heterogeneous quadrotors via reinforcement learning
    Zhao, Wanbing
    Liu, Hao
    Wang, Bohui
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2021, 36 (06) : 2528 - 2547
  • [35] Linear Quadratic Control Using Model-Free Reinforcement Learning
    Yaghmaie, Farnaz Adib
    Gustafsson, Fredrik
    Ljung, Lennart
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2023, 68 (02) : 737 - 752
  • [36] Model-Free Reinforcement Learning of Impedance Control in Stochastic Environments
    Stulp, Freek
    Buchli, Jonas
    Ellmer, Alice
    Mistry, Michael
    Theodorou, Evangelos A.
    Schaal, Stefan
    IEEE TRANSACTIONS ON AUTONOMOUS MENTAL DEVELOPMENT, 2012, 4 (04) : 330 - 341
  • [37] On Distributed Model-Free Reinforcement Learning Control With Stability Guarantee
    Mukherjee, Sayak
    Vu, Thanh Long
    IEEE CONTROL SYSTEMS LETTERS, 2021, 5 (05) : 1615 - 1620
  • [38] Model-Free Recurrent Reinforcement Learning for AUV Horizontal Control
    Huo, Yujia
    Li, Yiping
    Feng, Xisheng
    3RD INTERNATIONAL CONFERENCE ON AUTOMATION, CONTROL AND ROBOTICS ENGINEERING (CACRE 2018), 2018, 428
  • [39] Quadcopter Trajectory Tracking Control Based on Flatness Model Predictive Control and Neural Network
    Li, Yong
    Zhu, Qidan
    Elahi, Ahsan
    ACTUATORS, 2024, 13 (04)
  • [40] Model-Free Primitive-Based Iterative Learning Control Approach to Trajectory Tracking of MIMO Systems With Experimental Validation
    Radac, Mircea-Bogdan
    Precup, Radu-Emil
    Petriu, Emil M.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (11) : 2925 - 2938