Research on automatic pilot repetition generation method based on deep reinforcement learning

Cited by: 2

Authors:

Pan, Weijun [1]

Jiang, Peiyuan [1]

Li, Yukun [1]

Wang, Zhuang [1]

Huang, Junxiang [2]

Affiliations:

[1] Civil Aviation Flight University of China, College of Air Traffic Management, Air Traffic Control Automation Laboratory, Deyang, People's Republic of China

[2] East China Air Traffic Management Bureau, Department of Safety Management, Xiamen Air Traffic Management Station, Xiamen, People's Republic of China

Source:

FRONTIERS IN NEUROROBOTICS | 2023 / Vol. 17

Funding:

National Natural Science Foundation of China;

Keywords:

controller training; transfer learning; text generation; reinforcement learning; generalization; RECOGNITION; EXTRACTION; AGENT;

DOI:

10.3389/fnbot.2023.1285831

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Using computers to replace pilot seats in air traffic control (ATC) simulators is an effective way to improve controller training efficiency and reduce training costs. To achieve this, we propose a deep reinforcement learning model, RoBERTa-RL (RoBERTa with Reinforcement Learning), for generating pilot repetitions. RoBERTa-RL is based on the pre-trained language model RoBERTa and is optimized through transfer learning and reinforcement learning. Transfer learning addresses the scarcity of data in the ATC domain, while reinforcement learning is used to optimize the RoBERTa model and overcome the limits on generalization caused by transfer learning. We selected a real-world area control dataset as the training and testing dataset for the target task, and a tower control dataset generated according to civil aviation radio land-air communication rules as the test dataset for evaluating model generalization. On the area control dataset, RoBERTa-RL achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.9962, 0.992, and 0.996, respectively. On the tower control dataset, the scores were 0.982, 0.954, and 0.982, respectively. To overcome the limitations of ROUGE in this field, we also evaluated the proposed model architecture in detail using keyword-based evaluation criteria for the generated repetition instructions. These criteria compute several keyword-based metrics from the segmented text of each repetition instruction. Under the keyword-based criteria, the model achieved an overall accuracy of 98.8% on the area control dataset and 81.8% on the tower control dataset. In terms of generalization, RoBERTa-RL improved accuracy by 56% compared to the model before improvement and by 47.5% compared to various comparative models. These results indicate that using reinforcement learning strategies to enhance deep learning algorithms can effectively mitigate poor generalization in text generation tasks, and the approach holds promise for future application in other related domains.
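The ROUGE scores reported in the abstract measure n-gram overlap (ROUGE-1, ROUGE-2) and longest-common-subsequence overlap (ROUGE-L) between a generated repetition and its reference. The following is a minimal sketch of these metrics, not the authors' evaluation code. It assumes whitespace tokenization and the F1 variant of each score (the abstract does not state which variant was used), and the example instruction pair is hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """F1 of precision (overlap / candidate) and recall (overlap / reference)."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N (F1 variant, an assumption): clipped n-gram overlap."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L (F1 variant, an assumption): based on the LCS length."""
    return f1(lcs_length(candidate, reference), len(candidate), len(reference))


# Hypothetical readback pair: the candidate repeats the wrong altitude.
reference = "climbing to eight thousand meters air china four one two".split()
candidate = "climbing to nine thousand meters air china four one two".split()

print(round(rouge_n(candidate, reference, 1), 3))  # 0.9
print(round(rouge_n(candidate, reference, 2), 3))  # 0.778
print(round(rouge_l(candidate, reference), 3))     # 0.9
```

In this hypothetical pair the candidate repeats an incorrect altitude, yet ROUGE-1 and ROUGE-L are still 0.9. This is the kind of limitation that motivates the keyword-based criteria mentioned in the abstract.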


Pages: 13

