An ensemble approach for imbalanced multiclass malware classification using 1D-CNN

Authors
Panda B. [1]
Bisoyi S.S. [2]
Panigrahy S. [3]
Affiliations
[1] Department of Computer Science and Engineering, Institute of Technical Education and Research, Siksha ‘O’ Anusandhan (Deemed to be) University, Odisha, Bhubaneswar
[2] Department of Computer Science and Information Technology, Institute of Technical Education and Research, Siksha ‘O’ Anusandhan (Deemed to be) University, Odisha, Bhubaneswar
[3] Haas School of Business, University of California, Berkeley, Berkeley, CA
Keywords
1D-CNN; API sequence; Dynamic analysis; Ensemble learning; Malware classification; Skip-gram
DOI
10.7717/PEERJ-CS.1677
Abstract
Growing dependence on the internet and on computer programs underscores their significance in day-to-day life. This dependence motivates malware developers to create more malware, in both quantity and variety. Because malware authors use code obfuscation techniques, researchers face constant hurdles in defending against potential hazards and risks. Metamorphic and polymorphic variants easily elude the widely used signature-based detection procedures. To analyze the behavior of such a vast number of malware variants, researchers are turning to deep learning approaches rather than traditional machine learning techniques. Beyond separating malware from benign programs, researchers have also been drawn to classifying malware into its own categories in order to examine the behavioral differences between them. To investigate the relationships among application programming interface (API) calls within API sequences and to classify those sequences, this work uses a one-dimensional convolutional neural network (1D-CNN) model to solve a multiclass classification problem. Feature vectors for the distinct APIs in the sequences are created using the Word2Vec word embedding approach with the skip-gram model. 1D-CNN models are trained with the one-vs.-rest approach to categorize malware, and all of them are then combined using a proposed ModifiedSoftVoting algorithm to improve classification. On the open benchmark dataset Mal-API-2019, the proposed ensemble 1D-CNN architecture achieves improved evaluation scores, with an accuracy of 0.90, a weighted average F1-score of 0.90, and an AUC score of more than 0.96 for all classes of malware.
Subjects: Data Mining and Machine Learning, Security and Privacy, Neural Networks
© 2023 Panda et al. Distributed under Creative Commons CC-BY 4.0. All Rights Reserved.
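The two mechanical steps named in the abstract, building skip-gram training pairs from an API sequence and combining one-vs.-rest scores by soft voting, can be illustrated with a minimal sketch. This is not the authors' implementation: the ModifiedSoftVoting algorithm is not specified in this record, so plain soft voting (normalize the per-class scores, pick the largest) is used in its place. The function names, the API-call trace, the class labels, and the scores are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def skipgram_pairs(api_calls: Sequence[str], window: int = 2) -> List[Tuple[str, str]]:
    """Return (center, context) pairs, the training examples a skip-gram model sees."""
    pairs = []
    for i, center in enumerate(api_calls):
        lo = max(0, i - window)
        hi = min(len(api_calls), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a call is never its own context
                pairs.append((center, api_calls[j]))
    return pairs


def soft_vote(ovr_scores: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Plain soft voting over one-vs.-rest scores (an assumption, not ModifiedSoftVoting).

    Each value is the positive-class score of one binary model. Scores are
    normalized to sum to 1 and the highest-scoring class is returned.
    """
    total = sum(ovr_scores.values())
    if total <= 0:
        raise ValueError("at least one score must be positive")
    normalized = {label: score / total for label, score in ovr_scores.items()}
    best = max(normalized, key=normalized.get)
    return best, normalized


# Hypothetical API-call trace from one sample.
trace = ["NtCreateFile", "NtWriteFile", "NtClose"]
pairs = skipgram_pairs(trace, window=1)

# Hypothetical positive-class scores from three one-vs.-rest models.
scores = {"Trojan": 0.2, "Worm": 0.6, "Spyware": 0.2}
label, probs = soft_vote(scores)
print(pairs)
print(label, probs)
```

In a real pipeline the pairs would feed a Word2Vec trainer and the scores would come from the trained 1D-CNN models; both are replaced here by hand-written values to keep the sketch self-contained.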
Related papers
  • [1] An ensemble approach for imbalanced multiclass malware classification using 1D-CNN
    Panda, Binayak
    Bisoyi, Sudhanshu Shekhar
    Panigrahy, Sidhanta
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [2] An ensemble of pre-trained transformer models for imbalanced multiclass malware classification
    Demirkiran, Ferhat
    Cayir, Aykut
    Unal, Gur
    Dag, Hasan
    COMPUTERS & SECURITY, 2022, 121
  • [3] Classification of Malware Families Based on Efficient-Net and 1D-CNN Fusion
    Chong, Xulei
    Gao, Yating
    Zhang, Ru
    Liu, Jianyi
    Huang, Xingjie
    Zhao, Jinmeng
    ELECTRONICS, 2022, 11 (19)
  • [4] Attention based 1D-CNN for Mental Workload Classification using EEG
    Parveen, Fiza
    Bhavsar, Arnav
    PROCEEDINGS OF THE 16TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS, PETRA 2023, 2023: 739-745
  • [5] Classification of Subjects With Balance Disorders Using 1D-CNN and Inertial Sensors
    Napieralski, Jan Andrzej
    Tylman, Wojciech
    Kotas, Rafal
    Marciniak, Pawel
    Kaminski, Marek
    Janc, Magdalena
    Jozefowicz-Korczynska, Magdalena
    Zamyslowska-Szmytke, Ewa
    IEEE ACCESS, 2022, 10: 127610-127619
  • [6] Using Multi-features and Ensemble Learning Method for Imbalanced Malware Classification
    Zhang, Yunan
    Huang, Qingjia
    Ma, Xinjian
    Yang, Zeming
    Jiang, Jianguo
    2016 IEEE TRUSTCOM/BIGDATASE/ISPA, 2016: 965-973
  • [7] Classification of Human Activities Based on Radar Signals Using 1D-CNN and LSTM
    Zhu, Jianping
    Chen, Haiquan
    Ye, Wenbin
    2020 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2020
  • [8] Artifact Removal using Elliptic Filter and Classification using 1D-CNN for EEG signals
    Nagabushanam, P.
    George, S. Thomas
    Davu, Praharsha
    Bincy, P.
    Naidu, Meghana
    Radha, S.
    2020 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATION SYSTEMS (ICACCS), 2020: 551-556
  • [9] Foot type classification using sensor-enabled footwear and 1D-CNN
    Mei, Zhanyong
    Ivanov, Kamen
    Zhao, Guoru
    Wu, Yuanyuan
    Liu, Mingzhe
    Wang, Lei
    MEASUREMENT, 2020, 165
  • [10] EEG stress classification based on Doppler spectral features for ensemble 1D-CNN with LCL activation function
    Naren, J.
    Babu, A. Ramesh
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2024, 36 (04)