Detecting new obfuscated malware variants: A lightweight and interpretable machine learning approach

Cited by: 0
Authors
Madamidola, Oladipo A. [1]
Ngobigha, Felix [1]
Ez-zizi, Adnane [1]
Affiliations
[1] Univ Suffolk, Waterfront Bldg, Ipswich IP4 1QJ, England
Keywords
Cyber security; Obfuscated malware; Detection of unknown malware; Machine learning; Explainable machine learning;
DOI
10.1016/j.iswa.2024.200472
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine learning has been successfully applied in developing malware detection systems, with a primary focus on accuracy, and increasing attention to reducing computational overhead and improving model interpretability. However, an important question remains underexplored: How well can machine learning-based models detect entirely new forms of malware not present in the training data? In this study, we present a machine learning-based system for detecting obfuscated malware that is not only highly accurate, lightweight and interpretable, but also capable of successfully adapting to new types of malware attacks. Our system is capable of detecting 15 malware subtypes despite being exclusively trained on one malware subtype, namely the Transponder from the Spyware family. This system was built after training 15 distinct random forest-based models, each on a different malware subtype from the CIC-MalMem-2022 dataset. These models were evaluated against the entire range of malware subtypes, including all unseen malware subtypes. To maintain the system's streamlined nature, training was confined to the top five most important features, which also enhanced interpretability. The Transponder-focused model exhibited high accuracy, exceeding 99.8%, with an average processing speed of 5.7 μs per file. We also illustrate how the Shapley additive explanations technique can facilitate the interpretation of the model predictions. Our research contributes to advancing malware detection methodologies, pioneering the feasibility of detecting obfuscated malware by exclusively training a model on a single or a few carefully selected malware subtypes and applying it to detect unseen subtypes.
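The train-on-one-subtype, test-on-all protocol described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the data are synthetic rather than drawn from CIC-MalMem-2022, the subtype names, feature count and sample sizes are hypothetical, and ranking features by scikit-learn's impurity-based importance is only one possible way to pick a top five (the abstract does not say how the authors ranked features). The Shapley additive explanations step is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_FEATURES = 20  # hypothetical feature count, not the real dataset's


def sample(n, shift):
    """Draw n synthetic feature vectors; `shift` moves the first five features."""
    X = rng.normal(size=(n, N_FEATURES))
    X[:, :5] += shift
    return X


# Hypothetical subtypes, each a different shift on the same informative features.
subtype_shift = {"subtype_a": 3.0, "subtype_b": 2.5, "subtype_c": 4.0}

# Training set: benign samples plus ONE malware subtype only.
X_train = np.vstack([sample(400, 0.0), sample(400, subtype_shift["subtype_a"])])
y_train = np.array([0] * 400 + [1] * 400)

# Step 1: rank features with a forest trained on all of them.
ranker = RandomForestClassifier(n_estimators=100, random_state=0)
ranker.fit(X_train, y_train)
top5 = np.argsort(ranker.feature_importances_)[::-1][:5]

# Step 2: retrain a lightweight model on the five top-ranked features.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train[:, top5], y_train)

# Step 3: score against fresh benign samples mixed with each subtype,
# including the two subtypes the model never saw during training.
scores = {}
for name, shift in subtype_shift.items():
    X_test = np.vstack([sample(200, 0.0), sample(200, shift)])
    y_test = np.array([0] * 200 + [1] * 200)
    scores[name] = model.score(X_test[:, top5], y_test)

for name, acc in scores.items():
    print(f"{name}: accuracy = {acc:.3f}")
```

The synthetic subtypes are deliberately easy to separate, so the accuracies printed here say nothing about performance on real memory-dump features; the sketch only shows the shape of the evaluation loop.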
Pages: 13
Related papers
50 in total
  • [41] StormDroid: A Streaminglized Machine Learning-Based System for Detecting Android Malware
    Chen, Sen
    Xue, Minhui
    Tang, Zhushou
    Xu, Lihua
    Zhu, Haojin
    ASIA CCS'16: PROCEEDINGS OF THE 11TH ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2016, : 377 - 388
  • [42] Method of Detecting Malware Through Analysis of Opcodes Frequency with Machine Learning Technique
    Woo, Sang-Uk
    Kim, Dong-Hee
    Chung, Tai-Myoung
    ADVANCES IN COMPUTER SCIENCE AND UBIQUITOUS COMPUTING, 2017, 421 : 1019 - 1024
  • [43] An optimized positive-unlabeled learning method for detecting a large scale of malware variants
    Zhang, Jixin
    Khan, Mohammad Faham
    Lin, Xiaodong
    Qin, Zheng
    2019 IEEE CONFERENCE ON DEPENDABLE AND SECURE COMPUTING (DSC), 2019, : 182 - 189
  • [44] Detecting Malware Families and Subfamilies using Machine Learning Algorithms: An Empirical Study
    Odat, Esraa
    Alazzam, Batool
    Yaseen, Qussai M.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (02) : 761 - 765
  • [45] Towards Deep Learning-Based Approach for Detecting Android Malware
    Booz, Jarrett
    McGiff, Josh
    Hatcher, William
    Yu, Wei
    Nguyen, James
    Lu, Chao
    INTERNATIONAL JOURNAL OF SOFTWARE INNOVATION, 2019, 7 (04) : 1 - 24
  • [46] Detecting epileptic seizures using machine learning and interpretable features of human EEG
    Oleg E. Karpov
    Sergey Afinogenov
    Vadim V. Grubov
    Vladimir Maksimenko
    Sergey Korchagin
    Nikita Utyashev
    Alexander E. Hramov
    The European Physical Journal Special Topics, 2023, 232 : 673 - 682
  • [47] A Novel approach for detecting malware in Android applications using Deep learning
    Kaushik, Prashant
    Yadav, Pankaj K.
    2018 ELEVENTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2018, : 59 - 62
  • [48] Detecting Cryptomining Malware: a Deep Learning Approach for Static and Dynamic Analysis
    Darabian, Hamid
    Homayounoot, Sajad
    Dehghantanha, Ali
    Hashemi, Sattar
    Karimipour, Hadis
    Parizi, Reza M.
    Choo, Kim-Kwang Raymond
    JOURNAL OF GRID COMPUTING, 2020, 18 (02) : 293 - 303
  • [49] Lightweight On-Device Detection of Android Malware Based on the Koodous Platform and Machine Learning
    Krzyszton, Mateusz
    Bok, Bartosz
    Lew, Marcin
    Sikora, Andrzej
    SENSORS, 2022, 22 (17)
  • [50] Detecting epileptic seizures using machine learning and interpretable features of human EEG
    Karpov, Oleg E.
    Afinogenov, Sergey
    Grubov, Vadim V.
    Maksimenko, Vladimir
    Korchagin, Sergey
    Utyashev, Nikita
    Hramov, Alexander E.
    EUROPEAN PHYSICAL JOURNAL-SPECIAL TOPICS, 2023, 232 (05) : 673 - 682