Employing combined spatial and frequency domain image features for machine learning-based malware detection

Cited by: 0
Author
Bashar, Abul [1]
Affiliation
[1] Prince Mohammad Bin Fahd Univ, Dept Comp Engn, Khobar 31952, Saudi Arabia
Source
ELECTRONIC RESEARCH ARCHIVE | 2024 / Vol. 32 / No. 07
Keywords
image-based data; spatial and frequency domain; malware identification; machine learning classifiers; feature extraction; feature hybridization; FRAMEWORK;
DOI
10.3934/era.2024192
Chinese Library Classification (CLC) number
O1 [Mathematics];
Discipline classification code
0701; 070101;
Abstract
The ubiquitous adoption of Android devices has unfortunately brought a surge in malware threats that compromise user data, privacy, finances, and device integrity, to name a few. To combat this, numerous efforts have explored automated botnet detection mechanisms, with anomaly-based approaches leveraging machine learning (ML) gaining traction due to their signature-agnostic nature. However, the problem lies in devising accurate ML models that capture the ever-evolving landscape of malware by effectively leveraging all the possible features from Android application packages (APKs). This paper delved into this domain by proposing, implementing, and evaluating an image-based Android malware detection (AMD) framework that harnessed the power of feature hybridization. The core idea of this framework was the conversion of text-based data extracted from Android APKs into grayscale images. The novelty of this work lay in the unique image feature extraction strategies and their subsequent hybridization to achieve accurate malware classification using ML models. More specifically, four distinct feature extraction methodologies, namely, Texture and histogram of oriented gradients (HOG) from the spatial domain, and discrete wavelet transform (DWT) and Gabor from the frequency domain, were employed to hybridize the features for improved malware identification. To this end, three image-based datasets, namely, Dex, Manifest, and Composite, derived from the Information Security Centre of Excellence (ISCX) Android Malware dataset, were leveraged to evaluate the optimal data source for botnet classification. Popular ML classifiers, including naive Bayes (NB), multilayer perceptron (MLP), support vector machine (SVM), and random forest (RF), were employed for the classification task.
The experimental results demonstrated the efficacy of the proposed framework, achieving a peak classification accuracy of 93.03% and recall of 97.1% for the RF classifier using the Manifest dataset and a combination of Texture and HOG features. These findings validate the proof-of-concept and provide valuable insights for researchers exploring ML/deep learning (DL) approaches in the domain of AMD.
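The pipeline the abstract outlines (bytes to grayscale image, one spatial-domain and one frequency-domain descriptor, then hybridization by concatenation) can be illustrated with a minimal standard-library sketch. This is not the paper's implementation: the function names, the fixed image width, the single-cell gradient-orientation histogram (a much-simplified stand-in for HOG), and the one-level Haar subband energies (a stand-in for the DWT features) are all assumptions made here for illustration. The Texture and Gabor features and the classifiers are omitted.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Map each byte to one pixel (0-255), wrapping rows at a fixed width.

    The width is an illustrative assumption; rows are zero-padded so that
    the image height is even, which the 2x2 Haar step below relies on.
    """
    if not data:
        raise ValueError("data must not be empty")
    if width <= 0 or width % 2:
        raise ValueError("width must be a positive even number")
    rows = math.ceil(len(data) / width)
    rows += rows % 2
    padded = data.ljust(rows * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]


def orientation_histogram(img: list[list[int]], bins: int = 8) -> list[float]:
    """Spatial-domain descriptor: magnitude-weighted histogram of unsigned
    gradient orientations over the whole image (one cell, no block
    normalization, so only loosely HOG-like)."""
    hist = [0.0] * bins
    height, width = len(img), len(img[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned, in [0, pi)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def haar_subband_energies(img: list[list[int]]) -> list[float]:
    """Frequency-domain descriptor: mean energy of the approximation band
    and the three detail bands of a one-level orthonormal 2D Haar DWT."""
    sums = [0.0, 0.0, 0.0, 0.0]
    n_blocks = 0
    for y in range(0, len(img) - 1, 2):
        for x in range(0, len(img[0]) - 1, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            coeffs = (
                (a + b + c + d) / 2,  # approximation
                (a - b + c - d) / 2,  # difference across columns
                (a + b - c - d) / 2,  # difference across rows
                (a - b - c + d) / 2,  # diagonal difference
            )
            for i, coeff in enumerate(coeffs):
                sums[i] += coeff * coeff
            n_blocks += 1
    return [s / n_blocks for s in sums] if n_blocks else sums


def hybrid_features(img: list[list[int]], bins: int = 8) -> list[float]:
    """Hybridize by concatenating the spatial and frequency descriptors."""
    return orientation_histogram(img, bins) + haar_subband_energies(img)
```

Calling `hybrid_features(bytes_to_grayscale(raw_bytes))` on the bytes of a file yields a fixed-length vector (here 8 orientation bins plus 4 subband energies) that could be passed to any of the classifiers named in the abstract. In practice the two groups of values sit on very different scales, so some per-feature scaling would likely be needed before training.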
Pages: 4255-4290
Number of pages: 36