UAV Control Method Combining Reptile Meta-Reinforcement Learning and Generative Adversarial Imitation Learning

Cited by: 1

Authors:

Jiang, Shui ^{[1]}

Ge, Yanning ^{[1]}

Yang, Xu ^{[2]}

Yang, Wencheng ^{[3]}

Cui, Hui ^{[4]}

Affiliations:

[1] Fujian Normal Univ, Coll Comp & Cyber Secur, Fuzhou 350007, Peoples R China

[2] Minjiang Univ, Coll Comp & Control Engn, Fuzhou 350108, Peoples R China

[3] Univ Southern Queensland, Sch Math Phys & Comp, Darling Hts, Qld 4350, Australia

[4] Monash Univ, Dept Software Syst & Cybersecur, Melbourne, Vic 3800, Australia

Source:

FUTURE INTERNET | 2024 / Volume 16 / Issue 03

Keywords:

unmanned aerial vehicles (UAVs); meta-reinforcement learning; generative adversarial imitation learning;

DOI:

10.3390/fi16030105

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Reinforcement learning (RL) is pivotal in empowering Unmanned Aerial Vehicles (UAVs) to navigate and make decisions efficiently and intelligently within complex and dynamic surroundings. Despite its significance, RL is hampered by inherent limitations such as low sample efficiency, restricted generalization capabilities, and a heavy reliance on the intricacies of reward function design. These challenges often render single-method RL approaches inadequate, particularly in the context of UAV operations where high costs and safety risks in real-world applications cannot be overlooked. To address these issues, this paper introduces a novel RL framework that synergistically integrates meta-learning and imitation learning. By leveraging the Reptile algorithm from meta-learning and Generative Adversarial Imitation Learning (GAIL), coupled with state normalization techniques for processing state data, this framework significantly enhances the model's adaptability. It achieves this by identifying and leveraging commonalities across various tasks, allowing for swift adaptation to new challenges without the need for complex reward function designs. To ascertain the efficacy of this integrated approach, we conducted simulation experiments within two-dimensional environments. The empirical results clearly indicate that our GAIL-enhanced Reptile method surpasses conventional single-method RL algorithms in terms of training efficiency. This evidence underscores the potential of combining meta-learning and imitation learning to surmount the traditional barriers faced by reinforcement learning in UAV trajectory planning and decision-making processes.
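The Reptile step named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the quadratic toy tasks, the function names `inner_sgd` and `reptile`, and all hyperparameters are assumptions chosen for illustration, and the GAIL discriminator and state normalization are omitted. It shows only the generic Reptile rule: adapt a copy of the parameters to one sampled task, then move the shared initialization a fraction of the way toward the adapted parameters.

```python
import numpy as np


def inner_sgd(theta, target, steps=5, lr=0.1):
    """Adapt a copy of theta to one task by gradient descent.

    Toy task loss (an assumption): 0.5 * ||phi - target||^2.
    """
    phi = theta.copy()
    for _ in range(steps):
        grad = phi - target  # gradient of the toy quadratic loss
        phi -= lr * grad
    return phi


def reptile(task_targets, meta_steps=200, meta_lr=0.5, seed=0):
    """Reptile outer loop: theta <- theta + meta_lr * (phi - theta)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(task_targets.shape[1])
    for _ in range(meta_steps):
        # Sample one task, adapt to it, then interpolate toward the result.
        target = task_targets[rng.integers(len(task_targets))]
        phi = inner_sgd(theta, target)
        theta += meta_lr * (phi - theta)
    return theta


# Three hypothetical tasks, each defined only by its optimum.
task_targets = np.array([[1.0, 0.0], [3.0, 2.0], [2.0, 4.0]])
theta = reptile(task_targets)
print(theta)
```

In this toy setting the learned initialization settles among the task optima, so a few inner steps suffice to adapt to any of them. In a GAIL-based setup, the hand-written loss would presumably be replaced by a policy objective whose reward comes from a discriminator trained on expert demonstrations.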


Pages: 18
