Data Augmentation with Generative Models for Improved Malware Detection: A Comparative Study

Cited by: 0

Authors:

Burks, Roland, III [1]

Islam, Kazi Aminul [2]

Li, Jiang [2]

Lu, Yan [3]

Affiliations:

[1] Samford Univ, Dept MCS, Birmingham, AL 35229 USA

[2] Old Dominion Univ, Dept ECE, Norfolk, VA USA

[3] Old Dominion Univ, Dept CMSE, Norfolk, VA USA

Source:

2019 IEEE 10TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS & MOBILE COMMUNICATION CONFERENCE (UEMCON) | 2019

Keywords:

Variational Autoencoders; Generative Adversarial Networks; Deep Residual Networks; Deep Learning; CNN;

DOI:

10.1109/uemcon47517.2019.8993085

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Generative models have proven well suited to generating artificial data. Two of the most popular and promising are the Generative Adversarial Network (GAN) and the Variational Autoencoder (VAE). Both play critical roles in classification problems by generating synthetic data that helps train classifiers more accurately. Malware detection is the process of determining whether software on a host system is malicious and diagnosing what type of attack it represents. Without an adequate amount of training data, malware detection is less efficient. In this paper, we compare the two generative models as sources of synthetic training data to boost a Residual Network (ResNet-18) classifier for malware detection. Experimental results show that adding synthetic malware samples generated by the VAE to the training data improved the accuracy of ResNet-18 by 2%, compared to 6% for the GAN.
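The augmentation step described in the abstract, appending generated samples to the real training set before training the classifier, can be illustrated with a minimal sketch. This is not the authors' pipeline: `make_placeholder_generator` is a hypothetical stand-in that merely jitters real samples with Gaussian noise, where the paper would use a trained VAE decoder or GAN generator. The toy feature vectors, the label convention (1 = malware, 0 = benign), and all function names are assumptions made for illustration.

```python
import random


def make_placeholder_generator(class_samples, noise=0.05, seed=0):
    """Hypothetical stand-in for a trained VAE decoder or GAN generator.

    It only jitters a randomly chosen real sample with Gaussian noise and
    clips the result to [0, 1]; a real generative model would replace it.
    """
    rng = random.Random(seed)

    def generate():
        base = rng.choice(class_samples)
        return [min(1.0, max(0.0, x + rng.gauss(0.0, noise))) for x in base]

    return generate


def augment_training_set(samples, labels, generate, n_synthetic, target_label):
    """Return copies of the training data extended with synthetic samples."""
    synthetic = [generate() for _ in range(n_synthetic)]
    aug_samples = list(samples) + synthetic
    aug_labels = list(labels) + [target_label] * n_synthetic
    return aug_samples, aug_labels


# Toy data (assumed): 4-value feature vectors in [0, 1].
real_samples = [[0.1, 0.2, 0.3, 0.4], [0.9, 0.8, 0.7, 0.6], [0.2, 0.1, 0.4, 0.3]]
real_labels = [1, 0, 1]  # 1 = malware, 0 = benign (assumed convention)

# Generate synthetic samples only for the malware class.
malware_only = [s for s, y in zip(real_samples, real_labels) if y == 1]
generator = make_placeholder_generator(malware_only)

aug_samples, aug_labels = augment_training_set(
    real_samples, real_labels, generator, n_synthetic=5, target_label=1
)
print(len(aug_samples), aug_labels.count(1))  # 8 7
```

The augmented lists would then be passed to the classifier's training routine in place of the original data. Comparing generators, as the paper does, amounts to swapping the `generate` callable while holding the classifier and the real data fixed.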


Pages: 660-665

Page count: 6
