Generative Adversarial Networks for Increasing the Veracity of Big Data

Authors
Dering, Matthew L. [1]
Tucker, Conrad S. [2]
Affiliations
[1] Penn State Univ, Comp Sci & Engn, University Pk, PA 16802 USA
[2] Penn State Univ, Engn Design & Ind Engn, University Pk, PA 16802 USA
Funding
National Science Foundation (USA)
Keywords
Generative Models; Big Data; Deep Learning; GANs; Sketches
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This work describes how automated data generation can be integrated into a big data pipeline. A lack of veracity in big data can produce models that are inaccurate or biased by trends in the training data, leading to problems that become difficult to overcome as a pipeline matures. This work describes the use of a Generative Adversarial Network (GAN) to generate sketch data, such as the sketches that might be used in a human verification task. The generated sketches are checked for recognizability using a crowd-sourcing methodology: generated sketches were correctly recognized 43.8% of the time, compared with 87.7% for human-drawn sketches. The method is scalable and can be used to generate realistic data in many domains and to bootstrap a dataset for training a model prior to deployment.
Pages: 2595-2602
Page count: 8