Efficient privacy-preserving whole-genome variant queries

Cited by: 4
Authors
Akguen, Mete [1, 2, 3, 4]
Pfeifer, Nico [2, 5, 6]
Kohlbacher, Oliver [2, 3, 7]
Affiliations
[1] Univ Tubingen, Dept Comp Sci, Med Data Privacy & Privacy Preserving ML Healthca, Tubingen, Germany
[2] Univ Tubingen, Inst Bioinformat & Med Informat, Tubingen, Germany
[3] Univ Hosp Tubingen, Translat Bioinformat, Tubingen, Germany
[4] Izmir Inst Technol, Dept Comp Engn, Izmir, Turkey
[5] Univ Tubingen, Dept Comp Sci, Methods Med Informat, Tubingen, Germany
[6] Max Planck Inst Informat, Stat Learning Computat Biol, Saarbrucken, Germany
[7] Univ Tubingen, Appl Bioinformat, Dept Comp Sci, Tubingen, Germany
DOI
10.1093/bioinformatics/btac070
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Diagnosis and treatment decisions based on genomic data have become widespread as the cost of genome sequencing steadily decreases. In this context, disease-gene association studies are of great importance. However, genomic data are highly sensitive compared to other data types and contain information about individuals and their relatives. Many studies have shown that this information can be inferred from query-response pairs on genomic databases. In this work, we propose a method that uses secure multi-party computation to query genomic databases in a privacy-preserving manner. The proposed solution privately outsources genomic data from arbitrarily many sources to two non-colluding proxies and allows genomic databases to be stored safely in semi-honest cloud environments. It provides data privacy, query privacy and output privacy by using XOR-based sharing and, unlike previous solutions, allows queries to run efficiently on hundreds of thousands of genomes.
Results: We measure the performance of our solution with parameters similar to those of real-world applications. A genomic database with 3 000 000 variants can be queried with five genomic query predicates in under 400 ms. Querying 1 048 576 genomes, each containing 1 000 000 variants, for the presence of five different query variants takes approximately 6 min with a small amount of dedicated hardware and connectivity. These execution times are in the right range to enable real-world applications in medical research and healthcare. Unlike in previous studies, multiple databases can be queried with response times fast enough for practical application. To the best of our knowledge, this is the first solution that provides this performance for querying large-scale genomic data.
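The abstract names XOR-based sharing between two non-colluding proxies but does not spell out the protocol. The following is a minimal sketch of generic two-party XOR secret sharing only, not the authors' implementation. The function names (`share`, `reconstruct`) and the toy bit-vector encoding of variants are assumptions made for illustration.

```python
import secrets


def share(value: int, n_bits: int) -> tuple[int, int]:
    """Split an n_bits-wide value into two XOR shares.

    The first share is a uniformly random mask, so either share on its
    own reveals nothing about the value.
    """
    mask = secrets.randbits(n_bits)
    return mask, value ^ mask


def reconstruct(share_a: int, share_b: int) -> int:
    """Recombine two XOR shares into the original value."""
    return share_a ^ share_b


# Hypothetical toy encoding (an assumption, not taken from the paper):
# bit i is 1 if the genome carries variant i.
N_VARIANTS = 8
genome = 0b10110010

# One share would go to each of the two non-colluding proxies.
proxy_a_share, proxy_b_share = share(genome, N_VARIANTS)
assert reconstruct(proxy_a_share, proxy_b_share) == genome

# XOR is linear over shares: each proxy can XOR its own shares of two
# values locally, and the results recombine to the XOR of the values.
other = 0b01100110
other_a, other_b = share(other, N_VARIANTS)
combined = reconstruct(proxy_a_share ^ other_a, proxy_b_share ^ other_b)
assert combined == genome ^ other
```

In XOR-sharing schemes of this kind, XOR operations need no communication between the parties, whereas AND operations generally do. How the paper evaluates query predicates on top of the shares is not stated in the abstract.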
Pages: 2202-2210
Page count: 9