A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection

Cited by: 41
Authors
Li, Yi [1]
Xiong, Kaiqi [1]
Chin, Tommy [2]
Hu, Chengbin [1]
Affiliations
[1] Univ S Florida, Florida Ctr Cybersecur, Intelligent Comp Networking & Secur Lab, Tampa, FL 33620 USA
[2] Rochester Inst Technol, Dept Comp Secur, Rochester, NY 14623 USA
Funding
U.S. National Science Foundation
Keywords
Malware; domain generation algorithm; machine learning; security; networking; big data
DOI
10.1109/ACCESS.2019.2891588
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Attackers usually use a command and control (C2) server to manipulate the communication. To perform an attack, threat actors often employ a domain generation algorithm (DGA), which allows malware to communicate with the C2 server by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first separate the DGA domains from normal domains and then use a clustering method to identify the algorithms that generate those DGA domains. In the prediction model, a time-series model based on the hidden Markov model (HMM) is constructed to predict incoming domain features. Furthermore, we build a deep neural network (DNN) model to enhance the proposed machine learning framework by handling the large dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and 97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM prediction in the framework.
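As a rough illustration of the first-level step described in the abstract (separating DGA domains from normal ones), the sketch below computes a few lexical features that a classifier could consume. The abstract does not list the features the paper uses, so the feature set here (length, character entropy, digit ratio, vowel ratio) and the function names `shannon_entropy` and `lexical_features` are assumptions made for illustration only, not the authors' method.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of the character distribution."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(domain: str) -> dict:
    """Hypothetical feature set; the abstract does not specify the paper's features."""
    # Simplification: only the leftmost label of the domain name is inspected.
    label = domain.lower().split(".")[0]
    length = len(label)
    digits = sum(ch.isdigit() for ch in label)
    vowels = sum(ch in "aeiou" for ch in label)
    return {
        "length": length,
        "entropy": shannon_entropy(label),
        "digit_ratio": digits / length if length else 0.0,
        "vowel_ratio": vowels / length if length else 0.0,
    }


if __name__ == "__main__":
    # Example names are made up for demonstration, not taken from the paper's data.
    for name in ("google.com", "xk2q9vz7t1mw.net"):
        print(name, lexical_features(name))
```

Feature vectors of this kind could be passed to any standard classifier for the first level and to a clustering method for the second; which models the paper actually uses at each level is not stated in the abstract beyond the DNN and HMM components.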
Pages: 32765-32782
Page count: 18
Related papers
50 in total
  • [11] Malware Generation with Specific Behaviors to Improve Machine Learning-based Detection
    Smith, Michael R.
    Verzi, Stephen J.
    Johnson, Nicholas T.
    Zhou, Xin
    Khanna, Kanad
    Quynn, Sophie
    Krishnakumar, Raga
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 2160 - 2169
  • [12] Android Malware Detection Based on Machine Learning
    Wang, Qing-Fei
    Fang, Xiang
    2018 4TH ANNUAL INTERNATIONAL CONFERENCE ON NETWORK AND INFORMATION SYSTEMS FOR COMPUTERS (ICNISC 2018), 2018, : 434 - 436
  • [13] Towards a Mobile Malware Detection Framework with the Support of Machine Learning
    Geneiatakis, Dimitris
    Baldini, Gianmarco
    Fovino, Igor Nai
    Vakalis, Ioannis
    SECURITY IN COMPUTER AND INFORMATION SCIENCES, EURO-CYBERSEC 2018, 2018, 821 : 119 - 129
  • [14] Comparison of Deep Learning and the Classical Machine Learning Algorithm for the Malware Detection
    Sewak, Mohit
    Sahay, Sanjay K.
    Rathore, Hemant
    2018 19TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2018, : 293 - 296
  • [15] Learning-Based Artificial Algae Algorithm with Optimal Machine Learning Enabled Malware Detection
    Alalayah K.M.
    Alrayes F.S.
    Nour M.K.
    Alaidarous K.M.
    Alwayle I.M.
    Mohsen H.
    Ahmed I.A.
    Al Duhayyim M.
    Computer Systems Science and Engineering, 2023, 46 (03): : 3103 - 3119
  • [16] A Novel Malware Analysis Framework for Malware Detection and Classification using Machine Learning Approach
    Sethi, Kamalakanta
    Chaudhary, Shankar Kumar
    Tripathy, Bata Krishan
    Bera, Padmalochan
    ICDCN'18: PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING AND NETWORKING, 2018,
  • [17] Malware detection based on deep learning algorithm
    Ding Yuxin
    Zhu Siyi
    NEURAL COMPUTING & APPLICATIONS, 2019, 31 (02): : 461 - 472
  • [18] Malware detection based on deep learning algorithm
    Ding Yuxin
    Zhu Siyi
    Neural Computing and Applications, 2019, 31 : 461 - 472
  • [19] AMVG: Adaptive Malware Variant Generation Framework Using Machine Learning
    Choi, Jusop
    Shin, Dongsoon
    Kim, Hyoungshick
    Seotis, Jason
    Hong, Jin B.
    2019 IEEE 24TH PACIFIC RIM INTERNATIONAL SYMPOSIUM ON DEPENDABLE COMPUTING (PRDC 2019), 2019, : 246 - 255
  • [20] A Co-evolutionary Algorithm-Based Malware Adversarial Sample Generation Method
    Wang, Fangwei
    Lu, Yuanyuan
    Li, Qingru
    Wang, Changguang
    Bai, Yonglei
    2022 5TH IEEE CONFERENCE ON DEPENDABLE AND SECURE COMPUTING (IEEE DSC 2022), 2022,