Malicious domain detection based on semi-supervised learning and parameter optimization

Cited by: 1
Authors
Liao, Renjie [1]
Wang, Shuo [1, 2]
Affiliations
[1] Xian Satellite Control Ctr, State Key Lab Astronaut Dynam, Xian, Peoples R China
[2] Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China
Keywords
computer networks; internet; safety; telecommunication security; DNS
DOI
10.1049/cmu2.12739
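Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract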
Malicious domains provide malware with covert communication channels, which poses a severe threat to cybersecurity. Despite continuous progress in detecting malicious domains with various machine learning algorithms, maintaining an up-to-date, diverse set of finely labeled samples for training is difficult. To address this issue and improve detection accuracy as malicious domain variants increase, a novel malicious domain detection method named MDND-SS-PO is proposed that combines semi-supervised learning and parameter optimization. The contributions of the study are as follows. First, the method extracts statistical features of the IP address, the TTL value, the NXDomain record, and the domain name query characteristics to discriminate Domain-Flux and Fast-Flux domain names simultaneously. Second, an improved DBSCAN based on neighborhood division is designed to cluster labeled and unlabeled data with low time consumption. Third, based on the clustering hypothesis, unlabeled data is tagged with pseudo-labels according to the clustering results, so that a supervised classifier can be trained effectively with less labeling effort. Finally, Gaussian process regression is used to optimize the parameter settings of the algorithm, and the Silhouette index and F1 score are introduced to evaluate the optimization results. Experimental results show that the proposed method achieves a detection performance of 0.885 when the ratio of labeled data is 5%.
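As a rough illustration of the clustering-hypothesis step described in the abstract, the following Python sketch clusters labeled and unlabeled points together and then gives each unlabeled point the majority label of its cluster. It is not the authors' implementation: it uses a plain textbook DBSCAN rather than the improved neighborhood-division variant, toy 2-D points stand in for the DNS statistical features, and the names `dbscan`, `pseudo_label`, `points`, and `known` are hypothetical. The Gaussian process parameter optimization is omitted.

```python
from collections import Counter, deque
from math import dist

NOISE = -1  # cluster id used for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Plain DBSCAN (assumption: not the paper's improved variant).

    Returns one cluster id per point; NOISE marks outliers.
    """
    n = len(points)
    # Brute-force eps-neighborhoods; every point counts as its own neighbor.
    neighbors = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    cluster_ids = [None] * n
    next_id = 0
    for i in range(n):
        if cluster_ids[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            cluster_ids[i] = NOISE  # may later be claimed as a border point
            continue
        cluster_ids[i] = next_id
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if cluster_ids[j] == NOISE:
                cluster_ids[j] = next_id  # border point reachable from a core point
            if cluster_ids[j] is not None:
                continue
            cluster_ids[j] = next_id
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
        next_id += 1
    return cluster_ids


def pseudo_label(cluster_ids, known_labels):
    """Assign each point the majority known label of its cluster.

    known_labels maps point index -> label for the small labeled subset.
    Points in noise or in clusters without any labeled member get None.
    """
    votes = {}
    for idx, label in known_labels.items():
        cid = cluster_ids[idx]
        if cid != NOISE:
            votes.setdefault(cid, Counter())[label] += 1
    majority = {cid: c.most_common(1)[0][0] for cid, c in votes.items()}
    return [known_labels.get(i, majority.get(cid))
            for i, cid in enumerate(cluster_ids)]


# Toy data: two tight groups and one isolated point (illustrative only).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (9.0, 0.0)]
known = {0: "benign", 4: "malicious"}  # only two points carry true labels

cluster_ids = dbscan(points, eps=0.5, min_pts=3)
labels = pseudo_label(cluster_ids, known)
print(cluster_ids)
print(labels)
```

The pseudo-labeled points could then be passed, together with the originally labeled ones, to any supervised classifier; points left as `None` would simply be excluded from training in this sketch.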
Pages: 386-397
Number of pages: 12