Evaluating Variational Autoencoder as a Private Data Release Mechanism for Tabular Data

Cited by: 10
Authors
Li, Szu-Chuang [1]
Tai, Bo-Chen [1]
Huang, Yennun [1]
Affiliations
[1] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
Keywords
variational autoencoder; private data release; k-anonymity; k-Level;
DOI
10.1109/PRDC47002.2019.00050
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Multi-market businesses can collect data from different business entities and aggregate data from various sources to create value. However, privacy regulations can make it illegal to exchange data between business entities of the same parent company unless users have opted in to allow it. Regulations such as the EU's GDPR allow data exchange if the data is anonymized appropriately. In this study, we use a variational autoencoder as a mechanism to generate synthetic data. The privacy and utility of the generated data sets are measured, and the results are compared with those of a plain autoencoder. The primary findings of this study are: 1) a variational autoencoder can be an option for data exchange, with good accuracy even when the number of latent dimensions is low; 2) a plain autoencoder still provides better accuracy when the number of hidden nodes is high; 3) a variational autoencoder, as a generative model, can be given to a data user to generate their own version of the data that closely mimics the original data set.
Pages: 198-206
Page count: 9
Related papers
50 in total
  • [1] CTVAE: Contrastive Tabular Variational Autoencoder for imbalance data
    Wang, Alex X.
    Le, Minh Quang
    Duong, Huu-Thanh
    Van, Bay Nguyen
    Nguyen, Binh P.
    KNOWLEDGE AND INFORMATION SYSTEMS, 2025,
  • [2] Interpretation for Variational Autoencoder Used to Generate Financial Synthetic Tabular Data
    Wu, Jinhong
    Plataniotis, Konstantinos
    Liu, Lucy
    Amjadian, Ehsan
    Lawryshyn, Yuri
    ALGORITHMS, 2023, 16 (02)
  • [3] DPTVAE: Data-driven prior-based tabular variational autoencoder for credit data synthesizing
    Tan, Yandan
    Zhu, Hongbin
    Wu, Jie
    Chai, Hongfeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 241
  • [4] Spatiotemporal subspace variational autoencoder with repair mechanism for traffic data imputation
    Qian, Jialong
    Zhang, Shiqi
    Pian, Yuzhuang
    Chen, Xinyi
    Liu, Yonghong
    NEUROCOMPUTING, 2025, 617
  • [5] Variational AutoEncoder for synthetic insurance data
    Jamotton, Charlotte
    Hainaut, Donatien
    INTELLIGENT SYSTEMS WITH APPLICATIONS, 2024, 24
  • [6] Variational Relevance Vector Machine for Tabular Data
    Kropotov, Dmitry
    Vetrov, Dmitry
    Wolf, Lior
    Hassner, Tal
    PROCEEDINGS OF 2ND ASIAN CONFERENCE ON MACHINE LEARNING (ACML2010), 2010, 13 : 79 - 94
  • [7] Deterministic Autoencoder using Wasserstein loss for tabular data generation
    Wang, Alex X.
    Nguyen, Binh P.
    NEURAL NETWORKS, 2025, 185
  • [8] Variational AutoEncoder to Identify Anomalous Data in Robots
    Pangione, Luigi
    Burroughes, Guy
    Skilton, Robert
    ROBOTICS, 2021, 10 (03)
  • [9] A Variational Autoencoder for Heterogeneous Temporal and Longitudinal Data
    Ogretir, Mine
    Ramchandran, Siddharth
    Papatheodorou, Dimitrios
    Lahdesmaki, Harri
    2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 1522 - 1529
  • [10] Crash data augmentation using variational autoencoder
    Islam, Zubayer
    Abdel-Aty, Mohamed
    Cai, Qing
    Yuan, Jinghui
    ACCIDENT ANALYSIS AND PREVENTION, 2021, 151