Securing Aggregate Queries for DNA Databases

Cited by: 8
Authors
Nassar, Mohamed [1 ,2 ]
Malluhi, Qutaibah [1 ,2 ]
Atallah, Mikhail [3 ]
Shikfa, Abdullatif [1 ,2 ]
Affiliations
[1] Qatar Univ, Dept Comp Sci & Engn, Doha, Qatar
[2] Qatar Univ, KINDI Ctr Comp Res, Doha, Qatar
[3] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
DNA databases; cloud security; secure outsourcing; data privacy
DOI
10.1109/TCC.2017.2682860
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper addresses the problem of sharing person-specific genomic sequences to support large-scale biomedical research projects without violating the privacy of the data subjects. The proposed method builds on the framework of Kantarcioglu et al. [1] but extends the results in a number of ways. One improvement is that our scheme is deterministic, with zero probability of a wrong answer (as opposed to a low probability). We also provide a new operating point in the space-time tradeoff, by offering a scheme that is twice as fast as theirs but uses twice the storage space. This point is motivated by the fact that storage is cheaper than computation in current cloud computing pricing plans. Moreover, our encoding of the data makes it possible for us to handle a richer set of queries than exact matching between the query and each sequence of the database, including: (i) counting the number of matches between the query symbols and a sequence; (ii) logical OR matches where a query symbol is allowed to match a subset of the alphabet, thereby making it possible to handle (as a special case) a "not equal to" requirement for a query symbol (e.g., "not a G"); (iii) support for the extended alphabet of nucleotide base codes that encompasses ambiguities in DNA sequences (this happens on the DNA sequence side instead of the query side); (iv) queries that specify the number of occurrences of each kind of symbol in the specified sequence positions (e.g., two 'A' and four 'C' and one 'G' and three 'T', occurring in any order in the query-specified sequence positions); (v) a threshold query whose answer is 'yes' if the number of matches meets or exceeds a query-specified threshold (e.g., "7 or more matches out of the 15 query-specified positions"). In addition, (vi) for all query types, we can hide the answers from the decrypting server, so that only the client learns the answer, and (vii) in all cases, the client deterministically learns only the query's answer, except for query type (v), where we quantify the (very small) statistical leakage to the client of the actual count.
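The query types listed in the abstract can be illustrated on unencrypted data. The following is a minimal Python sketch of their plaintext semantics only; it does not reproduce the paper's encoding or any cryptographic protocol. All function names (`count_matches`, `not_base`, `threshold_query`, `composition_query`, `aggregate_count`) are hypothetical, and the rule that an ambiguous sequence code matches whenever its set of possible bases overlaps the allowed set is an assumption made for illustration, not a statement of the authors' definition.

```python
from collections import Counter

# Standard IUPAC nucleotide codes: each code denotes a set of possible bases.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}

def not_base(base):
    """Query type (ii), special case: allow every base except `base`."""
    return {"A", "C", "G", "T"} - {base}

def count_matches(sequence, query):
    """Query types (i)-(iii): count matching positions.

    `query` maps a sequence position to the set of bases allowed there
    (a logical OR). ASSUMPTION: an ambiguous sequence code counts as a
    match if any base it may stand for is allowed.
    """
    return sum(1 for pos, allowed in query.items()
               if IUPAC[sequence[pos]] & allowed)

def threshold_query(sequence, query, threshold):
    """Query type (v): 'yes' if at least `threshold` positions match."""
    return count_matches(sequence, query) >= threshold

def composition_query(sequence, positions, required):
    """Query type (iv): symbol counts at `positions`, in any order."""
    observed = Counter(sequence[p] for p in positions)
    wanted = Counter({b: n for b, n in required.items() if n > 0})
    return observed == wanted

def aggregate_count(database, predicate):
    """Aggregate over the database: how many sequences satisfy `predicate`."""
    return sum(1 for seq in database if predicate(seq))

# Hypothetical toy database and query: 'A' at 0, 'C' at 1, "not a G" at 4.
db = ["ACGTAC", "GCGTGC", "RCGTAC"]
query = {0: {"A"}, 1: {"C"}, 4: not_base("G")}
matches = [count_matches(s, query) for s in db]
hits = aggregate_count(db, lambda s: threshold_query(s, query, 2))
```

In this toy run, the first and third sequences match all three query positions and the second matches only one, so two sequences pass a threshold of 2. In the setting the abstract describes, such counts would be computed over encoded, outsourced data rather than over plaintext strings as here.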
Pages: 827-837
Page count: 11