Data Augmentation with Generative Models for Improved Malware Detection: A Comparative Study

Cited by: 0
Authors
Burks, Roland, III [1]
Islam, Kazi Aminul [2]
Li, Jiang [2]
Lu, Yan [3]
Affiliations
[1] Samford Univ, Dept MCS, Birmingham, AL 35229 USA
[2] Old Dominion Univ, Dept ECE, Norfolk, VA USA
[3] Old Dominion Univ, Dept CMSE, Norfolk, VA USA
Source
2019 IEEE 10TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON) | 2019
Keywords
Variational Autoencoders; Generative Adversarial Networks; Deep Residual Networks; Deep Learning; CNN
DOI
10.1109/uemcon47517.2019.8993085
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Generative models have proven useful for generating artificial data. Two of the most popular and promising are the Generative Adversarial Network (GAN) and the Variational Autoencoder (VAE). Both play critical roles in classification problems by generating synthetic data that helps train classifiers more accurately. Malware detection is the process of determining whether software on a host system is malicious and diagnosing what type of attack it represents. Without an adequate amount of training data, malware detection becomes less efficient. In this paper, we compare the two generative models as sources of synthetic training data to boost a Residual Network (ResNet-18) classifier for malware detection. Experimental results show that adding synthetic malware samples generated by the VAE to the training data improved the accuracy of ResNet-18 by 2%, compared with 6% for the GAN.
Pages: 660-665
Page count: 6
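
The augmentation step described in the abstract, adding synthetic samples to the real training data before training a classifier, might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `augment_training_set`, the array shapes, and the random placeholder arrays standing in for VAE or GAN output are all hypothetical and are not taken from the paper.

```python
import numpy as np


def augment_training_set(real_x, real_y, synth_x, synth_y, seed=0):
    """Append synthetic samples to the real training set and shuffle.

    Hypothetical helper: the paper does not specify how the sets are merged.
    """
    if real_x.shape[1:] != synth_x.shape[1:]:
        raise ValueError("real and synthetic samples must share a shape")
    x = np.concatenate([real_x, synth_x], axis=0)
    y = np.concatenate([real_y, synth_y], axis=0)
    # Shuffle so synthetic samples are not clustered at the end of the set.
    order = np.random.default_rng(seed).permutation(len(x))
    return x[order], y[order]


# Placeholder data (assumption): random arrays stand in for real malware
# images and for samples that a trained VAE or GAN would generate.
rng = np.random.default_rng(42)
real_x = rng.random((100, 32, 32))
real_y = rng.integers(0, 2, size=100)
synth_x = rng.random((20, 32, 32))
synth_y = np.ones(20, dtype=real_y.dtype)  # all synthetic samples labeled malware

aug_x, aug_y = augment_training_set(real_x, real_y, synth_x, synth_y)
```

The augmented arrays would then be passed to whatever classifier training loop is in use; the classifier itself (ResNet-18 in the paper) is outside the scope of this sketch.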