Malware Classification Using Probability Scoring and Machine Learning

Cited by: 25
Authors
Xue, Di [1]
Li, Jingmei [1]
Lv, Tu [1]
Wu, Weifei [1]
Wang, Jiaxiang [1]
Affiliations
[1] Harbin Engn Univ, Coll Comp Sci & Tc & Mol, Harbin 150001, Heilongjiang, Peoples R China
Keywords
Grayscale image; native API call; malware; machine learning; probability scoring; static and dynamic analysis; NETWORKS;
DOI
10.1109/ACCESS.2019.2927552
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Malware classification plays an important role in tracing the sources of attacks on computer systems. Existing static analysis methods classify quickly but are ineffective against malware that uses packing and obfuscation; dynamic analysis methods handle packing and obfuscation more generally but incur excessive classification cost. To overcome these shortcomings, we propose Malscore, a classification system based on probability scoring and machine learning that uses a probability threshold to link static analysis (Phase 1) and dynamic analysis (Phase 2). In Phase 1, convolutional neural networks (CNNs) with spatial pyramid pooling analyze grayscale images (static features); in Phase 2, variable n-grams and machine learning analyze native API call sequences (dynamic features). By combining the two, Malscore both accelerates static analysis, taking advantage of the CNN's strength in image recognition, and appears more resilient to obfuscation thanks to dynamic analysis. Unlike other combined static and dynamic analysis techniques, most malware is labeled by static analysis alone, so overhead is reduced by dynamically analyzing only the few samples whose features are less obvious or more ambiguous in static analysis. We performed experiments on 174,607 malware samples from 63 malware families. Malscore achieved 98.82% accuracy for malware classification. Compared with a method that applies both static and dynamic analysis, Malscore reduced preprocessing time by 59.58% and test time by 61.70%.
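The threshold-gated hand-off between the two phases can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `classify_two_phase` and `api_ngrams`, the default threshold of 0.9, and the choice of n-gram lengths are all hypothetical, and the two models are stand-in callables that return one probability per malware family. "Variable n-grams" is interpreted here simply as counting n-grams of several lengths, which may differ from the paper's definition.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical interface: a model maps a sample to per-family probabilities.
Model = Callable[[object], Sequence[float]]


def classify_two_phase(
    sample: object,
    static_model: Model,
    dynamic_model: Model,
    threshold: float = 0.9,  # assumed value, not taken from the abstract
) -> Tuple[int, str]:
    """Return (family index, phase that produced the label)."""
    # Phase 1: cheap static analysis (e.g. a CNN over a grayscale image).
    probs = static_model(sample)
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] >= threshold:
        return best, "static"

    # Phase 2: costlier dynamic analysis, run only for low-confidence samples.
    probs = dynamic_model(sample)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best, "dynamic"


def api_ngrams(calls: Sequence[str], n_values: Iterable[int] = (2, 3)) -> Counter:
    """Count n-grams of each requested length in an API call sequence."""
    counts: Counter = Counter()
    for n in n_values:
        for i in range(len(calls) - n + 1):
            counts[tuple(calls[i:i + n])] += 1
    return counts
```

With stub models, a sample whose top static probability reaches the threshold is labeled in Phase 1 and never reaches the dynamic model; only the remaining samples pay the cost of Phase 2, which is the source of the overhead reduction the abstract reports.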
Pages: 91641-91656
Number of pages: 16
Related papers
50 in total
  • [31] Robust IoT Malware Detection and Classification Using Opcode Category Features on Machine Learning
    Lee, Hyunjong
    Kim, Sooin
    Baek, Dongheon
    Kim, Donghoon
    Hwang, Doosung
    [J]. IEEE ACCESS, 2023, 11 : 18855 - 18867
  • [32] Android Malware Classification Using Machine Learning and Bio-Inspired Optimisation Algorithms
    Pye, Jack
    Issac, Biju
    Aslam, Nauman
    Rafiq, Husnain
    [J]. 2020 IEEE 19TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2020), 2020, : 1777 - 1782
  • [33] Machine learning based Malware Classification for Android Applications using Multimodal Image Representations
    Kumar, Ajit
    Sagar, Pramod K.
    Kuppusamy, K. S.
    Aghila, G.
    [J]. PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO'16), 2016,
  • [34] Malware Classification of Portable Executables using Tree-Based Ensemble Machine Learning
    Atluri, Venkata
    [J]. 2019 IEEE SOUTHEASTCON, 2019,
  • [35] A Hypercuboid-Based Machine Learning Algorithm for Malware Classification
    Thi Thu Trang Nguyen
    Dai Tho Nguyen
    Duy Loi Vu
    [J]. 2021 RIVF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION TECHNOLOGIES (RIVF 2021), 2021, : 301 - 306
  • [36] Integrated Malware Analysis Using Machine Learning
    Singh, Akash Kumar
    Jain, Aruna
    [J]. 2017 2ND INTERNATIONAL CONFERENCE ON TELECOMMUNICATION AND NETWORKS (TEL-NET), 2017, : 347 - 354
  • [37] To Identify Malware Using Machine Learning Algorithms
    Pujari, Shivam
    Mandoria, H. L.
    Shrivastava, R. P.
    Singh, Rajesh
    [J]. COMPUTING SCIENCE, COMMUNICATION AND SECURITY, 2022, 1604 : 117 - 127
  • [38] Evolutionary feature selection for machine learning based malware classification
    Kale, Gulsade
    Bostanci, Gazi Erkan
    Celebi, Fatih Vehbi
    [J]. ENGINEERING SCIENCE AND TECHNOLOGY-AN INTERNATIONAL JOURNAL-JESTECH, 2024, 56
  • [39] Vulnerability Assessment of Machine Learning Based Malware Classification Models
    Raju, Godwin
    Zavarsky, Pavol
    Makanju, Adetokunbo
    Malik, Yasir
    [J]. PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'19 COMPANION), 2019, : 1615 - 1618
  • [40] Behavior Analysis of Malware Using Machine Learning
    Dhammi, Arshi
    Singh, Maninder
    [J]. 2015 EIGHTH INTERNATIONAL CONFERENCE ON CONTEMPORARY COMPUTING (IC3), 2015, : 481 - 486