A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection

Cited by: 41
Authors
Li, Yi [1]
Xiong, Kaiqi [1]
Chin, Tommy [2]
Hu, Chengbin [1]
Affiliations
[1] Univ S Florida, Florida Ctr Cybersecur, Intelligent Comp Networking & Secur Lab, Tampa, FL 33620 USA
[2] Rochester Inst Technol, Dept Comp Secur, Rochester, NY 14623 USA
Funding
U.S. National Science Foundation
Keywords
Malware; domain generation algorithm; machine learning; security; networking; big data
DOI
10.1109/ACCESS.2019.2891588
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Attackers usually use a command and control (C2) server to manipulate the communication. In order to perform an attack, threat actors often employ a domain generation algorithm (DGA), which can allow malware to communicate with C2 by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from the real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first classify the DGA domains apart from normal domains and then use the clustering method to identify the algorithms that generate those DGA domains. In the prediction model, a time-series model is constructed to predict incoming domain features based on the hidden Markov model (HMM). Furthermore, we build a deep neural network (DNN) model to enhance the proposed machine learning framework by handling the huge dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and 97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM prediction in the framework.
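The first level of the two-level model separates DGA domains from normal ones. The abstract does not say which features or classifier the authors use, so the following is only a minimal sketch under stated assumptions: it computes a few lexical features often used for this kind of task (label length, Shannon entropy, digit ratio) and applies a toy threshold rule. The function names, the feature set, and the thresholds are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(label: str) -> float:
    """Shannon entropy, in bits per character, of a string."""
    if not label:
        return 0.0
    counts = Counter(label)
    total = len(label)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def domain_features(domain: str) -> dict:
    """Hypothetical lexical features of the leftmost label of a domain name."""
    label = domain.lower().split(".")[0]
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "digit_ratio": sum(ch.isdigit() for ch in label) / max(len(label), 1),
    }


def looks_generated(domain: str, entropy_threshold: float = 3.5) -> bool:
    """Toy first-level rule (an assumption, not the paper's classifier):
    flag labels that are both long and high in character entropy."""
    features = domain_features(domain)
    return features["length"] >= 12 and features["entropy"] >= entropy_threshold
```

In practice, a feature dictionary like this would feed a trained classifier rather than a fixed threshold; the rule here only illustrates why algorithmically generated labels tend to stand out from human-chosen ones.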
Pages: 32765-32782
Page count: 18