Privacy Risks from Genomic Data-Sharing Beacons

Cited by: 127
Authors
Shringarpure, Suyash S. [1]
Bustamante, Carlos D. [1]
Affiliation
[1] Stanford Univ, Dept Genet, Stanford, CA 94305 USA
Keywords
PREDICTION;
DOI
10.1016/j.ajhg.2015.09.010
CLC number
Q3 [Genetics];
Subject classification code
071007 ; 090102 ;
Abstract
The human genetics community needs robust protocols that enable secure sharing of genomic data from participants in genetic research. Beacons are web servers that answer allele-presence queries such as "Do you have a genome that has a specific nucleotide (e.g., A) at a specific genomic position (e.g., position 11,272 on chromosome 1)?" with either "yes" or "no." Here, we show that individuals in a beacon are susceptible to re-identification even if the only data shared include presence or absence information about alleles in a beacon. Specifically, we propose a likelihood-ratio test of whether a given individual is present in a given genetic beacon. Our test is not dependent on allele frequencies and is the most powerful test for a specified false-positive rate. Through simulations, we showed that in a beacon with 1,000 individuals, re-identification is possible with just 5,000 queries. Relatives can also be identified in the beacon. Re-identification is possible even in the presence of sequencing errors and variant-calling differences. In a beacon constructed with 65 European individuals from the 1000 Genomes Project, we demonstrated that it is possible to detect membership in the beacon with just 250 SNPs. With just 1,000 SNP queries, we were able to detect the presence of an individual genome from the Personal Genome Project in an existing beacon. Our results show that beacons can disclose membership and implied phenotypic information about participants and do not protect privacy a priori. We discuss risk mitigation through policies and standards such as not allowing anonymous pings of genetic beacons and requiring minimum beacon sizes.
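The likelihood-ratio idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' formulation: the function name `beacon_llr` and its two inputs, the per-query probability of a "yes" answer when the individual is absent from the beacon and when the individual is present, are hypothetical and would in practice have to come from a model of the beacon (for example its size and an assumed error rate). The sketch only shows how yes/no responses to queries at a person's own variant sites accumulate into a single test statistic.

```python
import math

def beacon_llr(responses, p_yes_absent, p_yes_present):
    """Log-likelihood ratio for beacon membership (illustrative only).

    responses      -- iterable of booleans, one per query (True = beacon said "yes")
    p_yes_absent   -- assumed P("yes") per query if the person is NOT in the beacon
    p_yes_present  -- assumed P("yes") per query if the person IS in the beacon

    Returns log L(present) - log L(absent); larger values favour membership.
    Both probabilities are hypothetical model inputs, not values from the paper.
    """
    for p in (p_yes_absent, p_yes_present):
        if not 0.0 < p < 1.0:
            raise ValueError("probabilities must lie strictly between 0 and 1")

    responses = list(responses)
    n_yes = sum(1 for r in responses if r)
    n_no = len(responses) - n_yes

    # Queries are treated as independent Bernoulli trials under each hypothesis.
    ll_present = n_yes * math.log(p_yes_present) + n_no * math.log(1.0 - p_yes_present)
    ll_absent = n_yes * math.log(p_yes_absent) + n_no * math.log(1.0 - p_yes_absent)
    return ll_present - ll_absent


if __name__ == "__main__":
    # Ten queries, all answered "yes": evidence leans towards membership.
    print(beacon_llr([True] * 10, p_yes_absent=0.5, p_yes_present=0.99))
```

In use, the statistic would be compared against a threshold chosen for a target false-positive rate; how that threshold is set is outside the scope of this sketch.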
Pages: 631-646
Page count: 16