Evaluating Variational Autoencoder as a Private Data Release Mechanism for Tabular Data

Cited by: 10
Authors
Li, Szu-Chuang [1]
Tai, Bo-Chen [1]
Huang, Yennun [1]
Affiliation
[1] Academia Sinica, Research Center for Information Technology Innovation, Taipei, Taiwan
Keywords
variational autoencoder; private data release; k-anonymity; k-Level
DOI
10.1109/PRDC47002.2019.00050
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Multi-market businesses can collect data from different business entities and aggregate data from various sources to create value. However, privacy regulations can make it illegal to exchange data between business entities of the same parent company unless users have opted in. Regulations such as the EU's GDPR allow data exchange if the data is appropriately anonymized. In this study, we use a variational autoencoder as a mechanism to generate synthetic data. We measure the privacy and utility of the generated data sets and compare the results with those of a plain autoencoder. The primary findings of this study are: 1) a variational autoencoder can be an option for data exchange, with good accuracy even when the number of latent dimensions is low; 2) a plain autoencoder still provides better accuracy when the number of hidden nodes is high; 3) a variational autoencoder, as a generative model, can be given to a data user to generate their own version of the data that closely mimics the original data set.
Pages: 198-206
Number of pages: 9