Cryptographically Secure Private Record Linkage Using Locality-Sensitive Hashing

Authors
Wei, Ruidi [1]
Kerschbaum, Florian [1]
Affiliations
[1] Univ Waterloo, Waterloo, ON, Canada
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Vol. 17 / Issue 2
Funding
Natural Sciences and Engineering Research Council of Canada
DOI
10.14778/3626292.3626293
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Private record linkage (PRL) is the problem of identifying pairs of records that approximately match across datasets in a secure, privacy-preserving manner. Two-party PRL specifically allows each party to obtain a record from the other party only if that record matches one of its own. The privacy goal is that no information about the datasets other than the matching records should be released. A fundamental challenge is to avoid leaking information without having to compare all pairs of records. In plaintext record linkage, exhaustive comparison is avoided using a blocking strategy, e.g., locality-sensitive hashing. One recent approach proposed by He et al. (ACM CCS 2017) uses locality-sensitive hashing and then releases a provably differentially private representation of the hash bins. However, differential privacy still leaks some information, albeit a provably bounded amount, and does not protect against attacks such as property inference attacks. Another recent approach by Khurram and Kerschbaum (IEEE ICDE 2020) uses locality-preserving hashing and provides cryptographic security, i.e., it releases no information except the output. However, locality-preserving hash functions are much harder to construct than locality-sensitive hash functions, and hence the accuracy of this approach is limited, particularly on larger datasets. In this paper, we address the open problem of providing cryptographic security for PRL while using locality-sensitive hash functions. Using recent results in oblivious algorithms, we design a new cryptographically secure PRL protocol with locality-sensitive hash functions. Our prototype implementation can match 40,000 records in the British National Library/Toronto Public Library and the North Carolina Voter Registry datasets with 99.3% and 99.9% accuracy, respectively, in less than an hour, which is more than an order of magnitude faster than Khurram and Kerschbaum's work and at a higher accuracy.
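To illustrate the plaintext blocking step the abstract refers to, the following is a minimal Python sketch of locality-sensitive hashing used as a blocking strategy. It is not the protocol from the paper and provides no privacy or cryptographic security: both datasets are processed in the clear. The choice of MinHash over character bigrams, the banding parameters, and all function names (`shingles`, `minhash_signature`, `lsh_candidate_pairs`) are assumptions made for illustration only.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=2):
    """Return the set of character k-grams of a lower-cased, stripped string."""
    s = text.lower().strip()
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def _seeded_hash(seed, token):
    """Deterministic 64-bit hash of a token under a given seed (illustrative)."""
    digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens, num_hashes):
    """MinHash signature: the minimum seeded hash value for each seed."""
    return tuple(min(_seeded_hash(seed, t) for t in tokens)
                 for seed in range(num_hashes))


def lsh_candidate_pairs(records_a, records_b, bands=10, rows=2):
    """Return index pairs (i, j) whose records share at least one LSH bucket.

    Only these candidate pairs would be compared in detail, instead of all
    len(records_a) * len(records_b) pairs.
    """
    num_hashes = bands * rows
    # bucket key -> (indices from dataset A, indices from dataset B)
    buckets = defaultdict(lambda: (set(), set()))
    for side, records in enumerate((records_a, records_b)):
        for idx, rec in enumerate(records):
            sig = minhash_signature(shingles(rec), num_hashes)
            for b in range(bands):
                key = (b, sig[b * rows:(b + 1) * rows])
                buckets[key][side].add(idx)
    pairs = set()
    for left, right in buckets.values():
        pairs.update((i, j) for i in left for j in right)
    return pairs


if __name__ == "__main__":
    a = ["john smith", "mary jones"]
    b = ["jon smith", "peter brown", "mary jones"]
    print(sorted(lsh_candidate_pairs(a, b)))
```

Identical records always share every bucket and are therefore always candidates. Similar records collide with a probability that grows with their Jaccard similarity, and the `bands` and `rows` parameters trade recall against the number of candidate comparisons.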
Pages: 79-91
Page count: 13