Primal-Dual Deep Reinforcement Learning for Periodic Coverage-Assisted UAV Secure Communications

Cited by: 0

Authors:

Qin, Yunhui [1]

Xing, Zhifang [1,3]

Li, Xulong [2]

Zhang, Zhongshan [3]

Zhang, Haijun [2]

Affiliations:

[1] Univ Sci & Technol Beijing, Natl Sch Elite Engn, Beijing 100083, Peoples R China

[2] Univ Sci & Technol Beijing, Beijing Engn & Technol Res Ctr Convergence Network, Beijing, Peoples R China

[3] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China

Source:

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY | 2024 / Vol. 73 / No. 12

Funding:

National Natural Science Foundation of China; China Postdoctoral Science Foundation

Keywords:

Autonomous aerial vehicles; Jamming; Optimization; Trajectory; Resource management; Security; Communication system security; Unmanned aerial vehicle (UAV); periodic coverage evaluation; primal-dual optimization; deep reinforcement learning; constrained Markov decision process; RESOURCE-ALLOCATION; TRAJECTORY DESIGN; SECRECY; ENERGY;

DOI:

10.1109/TVT.2024.3450956

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]

Discipline classification codes:

0808; 0809

Abstract:

Considering UAVs' energy constraints and green communication requirements, this paper proposes a periodic coverage-assisted UAV secure communication system that maximizes the worst-case average achievable secrecy rate. UAV base stations serve legitimate users while UAV jammers periodically send interference signals to eavesdroppers. User scheduling, UAV trajectory, and power allocation are modeled as a constrained Markov decision process with a coverage evaluation constraint. The joint optimization of user scheduling, UAV trajectory, and power allocation is then solved with the primal-dual soft actor-critic (SAC) algorithm. Specifically, the reward critic network assesses the secrecy rate and the cost critic network fits the coverage constraint. Meanwhile, the actor network generates the user scheduling, UAV trajectory, and power allocation policy while updating the dual variables. For comparison, we also evaluate other deep reinforcement learning (DRL) solutions, namely the SAC algorithm and the twin-delayed deep deterministic policy gradient (TD3) algorithm, as well as the traditional random and greedy methods. Simulation results show that the proposed algorithm performs best in training speed, reward, and secrecy rate.
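
The primal-dual idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the step size, and the convention that the constraint is written as "expected cost must not exceed a limit" are all assumptions made here for illustration. In such a scheme the actor would maximize a Lagrangian that trades the reward critic's estimate against the cost critic's estimate, while the dual variable grows when the constraint is violated and shrinks (down to zero) when it is satisfied.

```python
def lagrangian_objective(reward_value, cost_value, dual, cost_limit):
    """Lagrangian L = J_r - dual * (J_c - d) that an actor would maximize.

    reward_value: reward critic estimate (e.g. secrecy rate), hypothetical input
    cost_value:   cost critic estimate (e.g. coverage shortfall), hypothetical input
    dual:         non-negative dual variable (Lagrange multiplier)
    cost_limit:   assumed constraint threshold d, with the constraint J_c <= d
    """
    return reward_value - dual * (cost_value - cost_limit)


def dual_ascent_step(dual, cost_value, cost_limit, step_size=0.05):
    """One projected gradient step on the dual variable.

    The dual increases when the cost estimate exceeds the limit and
    decreases otherwise; the max() projects it back onto dual >= 0.
    """
    return max(0.0, dual + step_size * (cost_value - cost_limit))


if __name__ == "__main__":
    # Made-up cost estimates: the constraint is violated first, then satisfied.
    dual = 0.0
    for cost_estimate in [0.9, 0.8, 0.6, 0.4, 0.3]:
        dual = dual_ascent_step(dual, cost_estimate, cost_limit=0.5)
        print(round(dual, 4))
```

In a full primal-dual SAC agent, `reward_value` and `cost_value` would come from the two critic networks and the actor would be updated by gradient ascent on the Lagrangian; the scalar version above only shows the direction of each update.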


Pages: 19641-19652

Page count: 12
