Efficient Jamming Policy Generation Method Based on Multi-Timescale Ensemble Q-Learning

Cited: 0
|
Authors
Qian, Jialong [1 ]
Zhou, Qingsong [1 ]
Li, Zhihui [1 ]
Yang, Zhongping [1 ]
Shi, Shasha [1 ]
Xu, Zhenjia [1 ]
Xu, Qiyun [2 ]
Affiliations
[1] Natl Univ Def Technol, Coll Elect Engn, Hefei 230037, Peoples R China
[2] PLA, Unit 93216, Beijing 100085, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
jamming policy generation; multifunctional radar; Q-learning; multi-timescale ensemble; RADAR;
DOI
10.3390/rs16173158
Chinese Library Classification (CLC)
X [Environmental Science, Safety Science];
Discipline Classification Code
08 ; 0830 ;
Abstract
With the advancement of radar technology toward multifunctionality and cognitive capability, traditional radar countermeasures are no longer sufficient to counter advanced multifunctional radar (MFR) systems. Rapid and accurate generation of an optimal jamming strategy is one of the key technologies for efficient radar countermeasures. To enhance the efficiency and accuracy of jamming policy generation, this paper proposes an efficient jamming policy generation method based on multi-timescale ensemble Q-learning (MTEQL). First, the task of generating jamming strategies is framed as a Markov decision process (MDP) by constructing a countermeasure scenario between the jammer and the radar and analyzing the principles of radar operation mode transitions. Then, multiple structure-dependent Markov environments are created from the real-world adversarial interactions between jammers and radars. Q-learning algorithms are executed concurrently in these environments, and their results are merged through an adaptive weighting mechanism based on the Jensen-Shannon divergence (JSD). Ultimately, a low-complexity, near-optimal jamming policy is derived. Simulation results indicate that the proposed method outperforms the standard Q-learning algorithm in jamming policy generation, achieving shorter decision-making times and a lower average strategy error rate.
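The ensemble step the abstract describes — parallel Q-learning runs fused by a JSD-based adaptive weighting — can be sketched in a few lines. This is a minimal illustrative sketch, not the paper's algorithm: the toy MDP (5 states, 3 actions, random dynamics), the perturbed-environment construction, and helper names such as `mteql_merge` are all assumptions made for the example.

```python
import numpy as np

def jsd(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(a * np.log(a / b)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def q_learning(P, R, rng, episodes=200, steps=20, alpha=0.1, gamma=0.9, eps=0.2):
    """Tabular Q-learning on an MDP with transitions P (S,A,S) and rewards R (S,A)."""
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    s = 0
    for _ in range(episodes):
        for _ in range(steps):
            a = rng.integers(A) if rng.random() < eps else int(Q[s].argmax())
            s2 = rng.choice(S, p=P[s, a])
            Q[s, a] += alpha * (R[s, a] + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q

def softmax_policy(Q):
    z = np.exp(Q - Q.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def mteql_merge(Q_list):
    """Fuse Q-tables: learners whose softmax policies sit closer to the
    ensemble mean (lower average per-state JSD) get exponentially more weight."""
    policies = [softmax_policy(Q) for Q in Q_list]
    mean_pol = np.mean(policies, axis=0)
    divs = np.array([np.mean([jsd(pi[s], mean_pol[s]) for s in range(len(pi))])
                     for pi in policies])
    w = np.exp(-divs)
    w /= w.sum()
    Q_ens = sum(wi * Qi for wi, Qi in zip(w, Q_list))
    return Q_ens, w

rng = np.random.default_rng(42)
S, A, K = 5, 3, 4  # toy radar states, jamming actions, parallel environments
base_R = rng.uniform(0, 1, size=(S, A))
Q_list = []
for k in range(K):
    # Each environment draws its own dynamics, standing in for the paper's
    # structure-dependent Markov environments built from jammer-radar interaction.
    P = rng.dirichlet(np.ones(S), size=(S, A))
    Q_list.append(q_learning(P, base_R, rng))

Q_ens, w = mteql_merge(Q_list)
policy = Q_ens.argmax(axis=1)  # one jamming action per radar state
```

The exponential down-weighting of high-divergence learners mirrors the "adaptive weighting mechanism that utilizes the JSD" claimed in the abstract: outlier Q-tables contribute less to the fused policy, which keeps the merged strategy close to the consensus while remaining cheap to compute.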
Pages: 21
Related Papers
50 records
  • [1] Multi-Timescale Ensemble Q-Learning for Markov Decision Process Policy Optimization
    Bozkus, Talha
    Mitra, Urbashi
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 1427 - 1442
  • [2] An Enhanced Ensemble Learning Method for Sentiment Analysis based on Q-learning
    Savargiv, Mohammad
    Masoumi, Behrooz
    Keyvanpour, Mohammad Reza
    IRANIAN JOURNAL OF SCIENCE AND TECHNOLOGY-TRANSACTIONS OF ELECTRICAL ENGINEERING, 2024, 48 (03) : 1261 - 1277
  • [3] Non-Stationary Policy Learning for Multi-Timescale Multi-Agent Reinforcement Learning
    Emami, Patrick
    Zhang, Xiangyu
    Biagioni, David
    Zamzam, Ahmed S.
    2023 62ND IEEE CONFERENCE ON DECISION AND CONTROL, CDC, 2023, : 2372 - 2378
  • [4] A Q-learning-based Multi-timescale Resilience Enhancement Approach for Power Grids with High Renewables
    Huang, Yanting
    Zhong, Qing
    Wang, Akang
    Lin, Shunjiang
    Peng, Chaoyi
    Lei, Shunbo
    2024 IEEE 2ND INTERNATIONAL CONFERENCE ON POWER SCIENCE AND TECHNOLOGY, ICPST 2024, 2024, : 1919 - 1924
  • [5] Design of cognitive radar jamming based on Q-learning algorithm
    Li, Yun-Jie
    Zhu, Yun-Peng
    Gao, Mei-Guo
    Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology, 2015, 35 (11): : 1194 - 1199
  • [6] Q-learning intelligent jamming decision algorithm based on efficient upper confidence bound variance
    Rao N.
    Xu H.
    Song B.
    Harbin Gongye Daxue Xuebao/Journal of Harbin Institute of Technology, 2022, 54 (05): : 162 - 170
  • [7] A Multi-Parameter Intelligent Communication Anti-Jamming Method Based on Three-Dimensional Q-Learning
    Pu, Ziming
    Niu, Yingtao
    Zhang, Guoliang
    2022 IEEE 2ND INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND ARTIFICIAL INTELLIGENCE (CCAI 2022), 2022, : 205 - 210
  • [8] Optimal method for the generation of the attack path based on the Q-learning decision
    Li T.
    Cao S.
    Yin S.
    Wei D.
    Ma X.
    Ma J.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2021, 48 (01): : 160 - 167
  • [9] Q-Learning with probability based action policy
    Ugurlu, Ekin Su
    Biricik, Goksel
    2006 IEEE 14TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS, VOLS 1 AND 2, 2006, : 210 - +
  • [10] Cooperative Q-Learning Based on Maturity of the Policy
    Yang, Mao
    Tian, Yantao
    Liu, Xiaomei
    2009 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS AND AUTOMATION, VOLS 1-7, CONFERENCE PROCEEDINGS, 2009, : 1352 - 1356