Android malware classification using optimum feature selection and ensemble machine learning

Authors
Islam R. [2]
Sayed M.I. [1]
Saha S. [1]
Hossain M.J. [2]
Masud M.A. [2]
Affiliations
[1] Computer Science, Western University, London, ON
[2] Computer Science and Information Technology, Patuakhali Science and Technology University, Dumki, Patuakhali
Keywords
Android; Category classification; Dynamic analysis; Ensemble; Malware; Supervised ML
DOI
10.1016/j.iotcps.2023.03.001
Abstract
The majority of smartphones on the market run the Android operating system. Security has been a core concern for this platform because it allows users to install apps from unknown sources. With thousands of apps produced and released daily, malware detection using Machine Learning (ML) has attracted significantly more attention than traditional detection techniques. Despite academic and commercial efforts, developing an efficient and reliable method for classifying malware remains challenging. As a result, several datasets for malware analysis have been generated and made available over the past ten years. These datasets may contain static features, such as API calls, intents, and permissions, or dynamic features, such as logcat errors, shared memory, and system calls. Dynamic analysis is more resilient to code obfuscation. Recent studies have carried out both binary and multi-class classification; the latter provides valuable insight into the nature of the malware. Because each malware variant operates differently, identifying its category may help prevent it. This study performed dynamic feature analysis for multi-class classification using the well-known ensemble ML approach called weighted voting. The ensemble model draws on Random Forest, K-Nearest Neighbors, Multi-Layer Perceptron, Decision Tree, Support Vector Machine, and Logistic Regression classifiers. We used a recent dataset named CCCS-CIC-AndMal-2020, which contains an extensive collection of Android applications and malware samples. A well-researched data preparation phase followed by weighted voting based on the R2 scores of the ML classifiers achieves an accuracy of 95.0% even after excluding 60.2% of the features, outperforming all recent studies. © 2023 The Authors
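The weighted voting idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their code, and the function name `weighted_vote`, the classifier keys, the weights, and the category labels below are all hypothetical values chosen for illustration. Each classifier casts one vote for a category, each vote counts in proportion to that classifier's weight (for example, a validation score), and the category with the largest total wins.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight for one sample.

    predictions: dict mapping classifier name -> predicted label
    weights:     dict mapping classifier name -> non-negative weight
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    # Ties are broken deterministically in favour of the
    # alphabetically first label.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical per-classifier weights (assumed values, not from the paper).
weights = {"rf": 0.90, "knn": 0.80, "mlp": 0.85,
           "dt": 0.70, "svm": 0.75, "lr": 0.60}

# Hypothetical predictions for a single app: a 3-3 split by head count.
predictions = {"rf": "Adware", "knn": "Adware", "mlp": "Adware",
               "dt": "Riskware", "svm": "Riskware", "lr": "Riskware"}

result = weighted_vote(predictions, weights)
print(result)
```

In this example an unweighted majority vote would be tied at three votes each, while the weighted totals (2.55 against 2.05) favour the category chosen by the more heavily weighted classifiers.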
Pages: 100-111
Number of pages: 11