Automatic malware classification and new malware detection using machine learning

Cited by: 63
Authors
Liu, Liu [1]
Wang, Bao-sheng [1]
Yu, Bo [1]
Zhong, Qiu-xi [1]
Affiliations
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Malware classification; Machine learning; n-gram; Gray-scale image; Feature extraction; Malware detection;
DOI
10.1631/FITEE.1601325
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The explosive growth of malware variants poses a major threat to information security. Traditional signature-based anti-virus systems fail to classify unknown malware into their corresponding families and to detect new kinds of malware programs. Therefore, we propose a machine learning based malware analysis system, which is composed of three modules: data processing, decision making, and new malware detection. The data processing module extracts malware features from gray-scale images, opcode n-grams, and import functions. The decision-making module uses these features to classify the malware and to identify suspicious malware. Finally, the detection module uses the shared nearest neighbor (SNN) clustering algorithm to discover new malware families. Our approach is evaluated on more than 20 000 malware instances, which were collected by Kingsoft, ESET NOD32, and Anubis. The results show that our system can effectively classify the unknown malware with a best accuracy of 98.9%, and successfully detect 86.7% of the new malware.
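The abstract names three building blocks without giving their details: opcode n-gram features, gray-scale images built from binaries, and shared-nearest-neighbor similarity. The Python sketch below illustrates each idea in its simplest form. It is not the authors' implementation: the function names, the n-gram length, the image width, and the zero-padding of the last image row are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable, List


def opcode_ngrams(opcodes: Iterable[str], n: int = 3) -> Counter:
    """Count contiguous opcode n-grams (n is an assumed parameter)."""
    ops = list(opcodes)
    # Slide a window of length n over the opcode sequence.
    return Counter(tuple(ops[i:i + n]) for i in range(len(ops) - n + 1))


def bytes_to_gray_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Map each byte to one pixel intensity (0..255), `width` pixels per row.

    The final row is zero-padded; that padding choice is an assumption.
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


def snn_similarity(neighbors_a: Iterable[int], neighbors_b: Iterable[int]) -> int:
    """Shared-nearest-neighbor similarity: size of the overlap of two
    k-nearest-neighbor lists (given here as sample indices)."""
    return len(set(neighbors_a) & set(neighbors_b))


if __name__ == "__main__":
    print(opcode_ngrams(["push", "mov", "call", "mov", "call"], n=2))
    print(bytes_to_gray_image(bytes([0, 255, 16, 32, 7]), width=2))
    print(snn_similarity([1, 2, 3], [2, 3, 4]))
```

In a full pipeline, the n-gram counts and image pixels would feed a classifier, and SNN similarities would drive a clustering step; those stages are omitted here because the record does not describe them.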
Pages: 1336 - 1347
Number of pages: 12
Related papers
50 in total
  • [31] Machine learning aided Android malware classification
    Milosevic, Nikola
    Dehghantanha, Ali
    Choo, Kim-Kwang Raymond
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2017, 61 : 266 - 274
  • [32] Detecting Malware with Classification Machine Learning Techniques
    Yusof, Mohd Azahari Mohd
    Abdullah, Zubaile
    Ali, Firkhan Ali Hamid
    Sukri, Khairul Amin Mohamad
    Hussain, Hanizan Shaker
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (06) : 167 - 172
  • [33] The Use of Machine Learning Techniques to Advance the Detection and Classification of Unknown Malware
    Shhadat, Ihab
    Bataineh, Bara'
    Hayajneh, Amena
    Al-Sharif, Ziad A.
    [J]. 11TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 3RD INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2020, 170 : 917 - 922
  • [34] Towards Explainable Quantum Machine Learning for Mobile Malware Detection and Classification
    Mercaldo, Francesco
    Ciaramella, Giovanni
    Iadarola, Giacomo
    Storto, Marco
    Martinelli, Fabio
    Santone, Antonella
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (23):
  • [35] Malware Classification System Based on Machine Learning
    Qu Wei
    Shi Xiao
    Li Dongbao
    [J]. PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 647 - 652
  • [36] AndyWar: an intelligent android malware detection using machine learning
    Roy, Sandipan
    Bhanja, Samit
    Das, Abhishek
    [J]. INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2023,
  • [37] Backdoor Malware Detection in Industrial IoT Using Machine Learning
    Khan, Maryam Mahsal
    Buriro, Attaullah
    Ahmad, Tahir
    Ullah, Subhan
    [J]. Computers, Materials and Continua, 2024, 81 (03) : 4691 - 4705
  • [38] Androhealthcheck: A malware detection system for android using machine learning
    Agrawal P.
    Trivedi B.
    [J]. Lecture Notes on Data Engineering and Communications Technologies, 2021, 66 : 35 - 41
  • [39] Comprehensive Behaviour of Malware Detection Using the Machine Learning Classifier
    Asha, P.
    Lahari, T.
    Kavya, B.
    [J]. SOFT COMPUTING SYSTEMS, ICSCS 2018, 2018, 837 : 462 - 469
  • [40] Hardware-Assisted Malware Detection using Machine Learning
    Pan, Zhixin
    Sheldon, Jennifer
    Sudusinghe, Chamika
    Charles, Subodha
    Mishra, Prabhat
    [J]. PROCEEDINGS OF THE 2021 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2021), 2021, : 1775 - 1780