Secure pseudonymisation for privacy-preserving probabilistic record linkage

Cited by: 15
Author
Smith, D. [1]
Affiliation
[1] Univ Manchester, Sch Social Sci, Humanities Bridgeford St, Oxford Rd, Manchester M13 9PL, Lancs, England
Keywords
Record linkage; Privacy; Bloom filter; Hash function
DOI
10.1016/j.jisa.2017.01.002
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Record linkage is becoming an increasingly important tool in many areas of research, particularly medical research, where the relevant data often reside in more than one location. In the absence of a reliable and unique identifier, probabilistic approaches to linkage are often employed. This linkage generally exploits the information contained in the fields that are common to a record pair. In classical record linkage the values in common fields are simply compared for equality. As values might contain typographical (or other) errors, the performance of classical record linkage can often be significantly improved if similarities between value pairs are also exploited. In applications where the data used for matching must be kept private, the raw values are replaced by pseudonyms. For better linkage performance these pseudonyms should also convey information regarding similarities. Existing approaches are often based on Bloom filters, yet these are susceptible to attack. Secure schemes based on Bloom filters inevitably involve additional security measures. Here we introduce a new scheme that produces pseudonyms that are far more secure than Bloom filters. It can be used as a drop-in replacement for many schemes that use Bloom filters. The new scheme allows similarity scores to be estimated from pairs of pseudonyms with negligible bias and with known variance for a given similarity score. © 2017 Elsevier Ltd. All rights reserved.
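For orientation, the Bloom-filter pseudonyms that the abstract describes as the existing approach can be sketched roughly as follows. This is not the scheme introduced in the paper, whose construction is not given in this record. It is a minimal illustration of the common baseline, assuming padded character bigrams, keyed double hashing, and the Dice coefficient as the similarity score. The function names, the filter length `m`, the number of hash functions `k`, and the choice of HMAC-SHA1 and HMAC-SHA256 are all assumptions made for the example.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a lower-cased, padded string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, key: bytes, m: int = 1000, k: int = 10) -> set[int]:
    """Return the set of bit positions set in an m-bit Bloom filter.

    Each bigram is mapped to k positions by double hashing with two keyed
    hashes (an assumed, commonly used construction).
    """
    positions: set[int] = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(key, data, hashlib.sha1).digest(), "big")
        h2 = int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")
        for i in range(k):
            positions.add((h1 + i * h2) % m)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    secret = b"shared-linkage-key"  # hypothetical key agreed by the data holders
    smith = bloom_encode("Smith", secret)
    smyth = bloom_encode("Smyth", secret)
    jones = bloom_encode("Jones", secret)
    print("Smith vs Smyth:", dice(smith, smyth))
    print("Smith vs Jones:", dice(smith, jones))
```

Because similar strings share most of their bigrams, their filters share most of their set bits, so the pseudonyms convey similarity without exposing the raw values directly. That same structure is what the abstract identifies as making Bloom filters susceptible to attack.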
Pages: 271-279
Number of pages: 9