Employing combined spatial and frequency domain image features for machine learning-based malware detection

Cited by: 0
Authors
Bashar, Abul [1]
Affiliation
[1] Prince Mohammad Bin Fahd Univ, Dept Comp Engn, Khobar 31952, Saudi Arabia
Source
ELECTRONIC RESEARCH ARCHIVE | 2024, Vol. 32, No. 7
Keywords
image-based data; spatial and frequency domain; malware identification; machine learning classifiers; feature extraction; feature hybridization; FRAMEWORK
DOI
10.3934/era.2024192
Chinese Library Classification
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
The ubiquitous adoption of Android devices has unfortunately brought a surge in malware threats that compromise user data, privacy, finances, and device integrity, to name a few. To combat this, numerous efforts have explored automated botnet detection mechanisms, with anomaly-based approaches leveraging machine learning (ML) gaining traction due to their signature-agnostic nature. However, the problem lies in devising accurate ML models that capture the ever-evolving malware landscape by effectively leveraging all the possible features from Android application packages (APKs). This paper delved into this domain by proposing, implementing, and evaluating an image-based Android malware detection (AMD) framework that harnessed the power of feature hybridization. The core idea of this framework was the conversion of text-based data extracted from Android APKs into grayscale images. The novelty of this work lay in the unique image feature extraction strategies and their subsequent hybridization to achieve accurate malware classification using ML models. More specifically, four distinct feature extraction methodologies, namely, Texture and histogram of oriented gradients (HOG) from the spatial domain, and discrete wavelet transform (DWT) and Gabor from the frequency domain, were employed to hybridize the features for improved malware identification. To this end, three image-based datasets, namely, Dex, Manifest, and Composite, derived from the Information Security Centre of Excellence (ISCX) Android Malware dataset, were leveraged to evaluate the optimal data source for botnet classification. Popular ML classifiers, including naive Bayes (NB), multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF), were employed for the classification task.
The experimental results demonstrated the efficacy of the proposed framework, achieving a peak classification accuracy of 93.03% and recall of 97.1% for the RF classifier using the Manifest dataset and a combination of Texture and HOG features. These findings validate the proof-of-concept and provide valuable insights for researchers exploring ML/deep learning (DL) approaches in the domain of AMD.
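The pipeline outlined in the abstract (APK data converted to a grayscale image, then spatial and frequency features concatenated into one vector) might be sketched roughly as follows. This is an illustrative sketch only, not the paper's implementation: the function names, the image width of 64, the single global orientation histogram (a simplified stand-in for HOG), and the one-level Haar transform (a simplified stand-in for the DWT features) are all assumptions made here, and random bytes stand in for real Dex or Manifest data.

```python
import numpy as np

def bytes_to_grayscale(data: bytes, width: int = 64) -> np.ndarray:
    """Map each byte to one pixel (0-255); zero-pad the last row (width is an assumed value)."""
    buf = np.frombuffer(data, dtype=np.uint8)
    height = max(2, -(-buf.size // width))  # ceiling division, at least two rows
    padded = np.zeros(height * width, dtype=np.uint8)
    padded[: buf.size] = buf
    return padded.reshape(height, width)

def orientation_histogram(img: np.ndarray, bins: int = 9) -> np.ndarray:
    """Spatial feature: one global, magnitude-weighted histogram of gradient
    orientations. Full HOG would also use cells and block normalization."""
    gy, gx = np.gradient(img.astype(np.float64))
    magnitude = np.hypot(gx, gy)
    angle = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

def haar_subband_energies(img: np.ndarray) -> np.ndarray:
    """Frequency feature: mean energy of the four sub-bands of a one-level
    orthonormal 2-D Haar transform (assumed wavelet; the paper's may differ)."""
    x = img.astype(np.float64)
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]  # crop to even dimensions
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    detail_1 = (a - b + c - d) / 2.0  # differences across columns
    detail_2 = (a + b - c - d) / 2.0  # differences across rows
    detail_3 = (a - b - c + d) / 2.0  # diagonal differences
    bands = (approx, detail_1, detail_2, detail_3)
    return np.array([np.mean(band ** 2) for band in bands])

def hybrid_features(img: np.ndarray) -> np.ndarray:
    """Hybridization here is plain concatenation of the two feature vectors."""
    return np.concatenate([orientation_histogram(img), haar_subband_energies(img)])

# Random bytes stand in for the contents extracted from an APK.
rng = np.random.default_rng(0)
fake_dex = rng.integers(0, 256, size=5000, dtype=np.uint8).tobytes()
image = bytes_to_grayscale(fake_dex, width=64)
features = hybrid_features(image)
print(image.shape, features.shape)
```

A vector like `features` could then be passed, one row per application, to any of the classifiers named in the abstract; the classifier step is left out here to keep the sketch dependency-light.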
Pages: 4255-4290
Page count: 36