Privacy-Preserving Synthetic Educational Data Generation

Cited by: 2
Authors
Vie, Jill-Jenn [1]
Rigaux, Tomas [1]
Minn, Sein [2]
Affiliations
[1] Inria Saclay, SODA, 1 Rue Honoré d'Estienne d'Orves, F-91120 Palaiseau, France
[2] Inria Saclay, CEDAR, 1 Rue Honoré d'Estienne d'Orves, F-91120 Palaiseau, France
Keywords
Generative models; Privacy; Item response theory
DOI
10.1007/978-3-031-16290-9_29
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Institutions collect massive learning traces but may not disclose them because of privacy concerns. Synthetic data generation opens new opportunities for research in education. In this paper we present a generative model for educational data that can preserve the privacy of participants, and an evaluation framework for comparing synthetic data generators. We show how naive pseudonymization can lead to re-identification threats and suggest techniques to guarantee privacy. We evaluate our method on existing massive open educational datasets.
Pages: 393-406
Page count: 14
Related papers
50 in total
  • [1] Privacy-Preserving Synthetic Data Generation for Recommendation Systems
    Liu, Fan
    Cheng, Zhiyong
    Chen, Huilin
    Wei, Yinwei
    Nie, Liqiang
    Kankanhalli, Mohan
    [J]. PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22), 2022: 1379-1389
  • [2] Privacy-Preserving Synthetic Smart Meters Data
    Del Grosso, Ganesh
    Pichler, Georg
    Piantanida, Pablo
    [J]. 2021 IEEE POWER & ENERGY SOCIETY INNOVATIVE SMART GRID TECHNOLOGIES CONFERENCE (ISGT), 2021
  • [3] Privacy-Preserving Synthetic Location Data in the Real World
    Cunningham, Teddy
    Cormode, Graham
    Ferhatosmanoglu, Hakan
    [J]. PROCEEDINGS OF 17TH INTERNATIONAL SYMPOSIUM ON SPATIAL AND TEMPORAL DATABASES, SSTD 2021, 2021: 23-33
  • [4] Evaluation of Synthetic Data for Privacy-Preserving Machine Learning
    Hittmeir, Markus
    Ekelhart, Andreas
    Mayer, Rudolf
    [J]. ERCIM NEWS, 2020, (123): 30-31
  • [5] Privacy-Preserving Anomaly Detection Using Synthetic Data
    Mayer, Rudolf
    Hittmeir, Markus
    Ekelhart, Andreas
    [J]. DATA AND APPLICATIONS SECURITY AND PRIVACY XXXIV, DBSEC 2020, 2020, 12122: 195-207
  • [6] Privacy-preserving data releases for health report generation
    Boyens, C
    Krishnan, R
    Padman, R
    [J]. MEDINFO 2004: PROCEEDINGS OF THE 11TH WORLD CONGRESS ON MEDICAL INFORMATICS, PT 1 AND 2, 2004, 107: 1268-1272
  • [7] Assessment of Creditworthiness Models Privacy-Preserving Training with Synthetic Data
    Munoz-Cancino, Ricardo
    Bravo, Cristian
    Rios, Sebastian A.
    Grana, Manuel
    [J]. HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022, 2022, 13469: 375-384
  • [8] Privacy-preserving generation and publication of synthetic trajectory microdata: A comprehensive survey
    Kim, Jong Wook
    Jang, Beakcheol
    [J]. JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2024, 230
  • [9] DataSynthesizer: Privacy-Preserving Synthetic Datasets
    Ping, Haoyue
    Stoyanovich, Julia
    Howe, Bill
    [J]. SSDBM 2017: 29TH INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT, 2017
  • [10] Privacy-Preserving Data Publishing
    Liu, Ruilin
    Wang, Hui
    [J]. 2010 IEEE 26TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDE 2010), 2010: 305-308