Privacy-preserving record linkage on large real world datasets

Cited by: 74
Authors
Randall, Sean M. [1]
Ferrante, Anna M. [1]
Boyd, James H. [1]
Bauer, Jacqueline K. [1]
Semmens, James B. [1]
Affiliations
[1] Curtin Univ, Fac Hlth Sci, Ctr Populat Hlth Res, Bentley, WA 6102, Australia
Keywords
Record linkage; Privacy preserving record linkage; Data integration; Bloom filters; Privacy preserving protocols; Population based research; HEALTH-SERVICES RESEARCH; AUSTRALIA; SECURITY;
DOI
10.1016/j.jbi.2013.12.003
Chinese Library Classification number
TP39 [Computer applications];
Subject classification code
081203 ; 0835 ;
Abstract
Record linkage typically involves the use of dedicated linkage units who are supplied with personally identifying information to determine individuals from within and across datasets. The personally identifying information supplied to linkage units is separated from clinical information prior to release by data custodians. While this substantially reduces the risk of disclosure of sensitive information, some residual risks still exist and remain a concern for some custodians. In this paper we trial a method of record linkage which reduces privacy risk still further on large real world administrative data. The method uses encrypted personal identifying information (Bloom filters) in a probability-based linkage framework. The privacy preserving linkage method was tested on ten years of New South Wales (NSW) and Western Australian (WA) hospital admissions data, comprising in total over 26 million records. No difference in linkage quality was found when the results were compared to traditional probabilistic methods using full unencrypted personal identifiers. This presents as a possible means of reducing privacy risks related to record linkage in population level research studies. It is hoped that through adaptations of this method or similar privacy preserving methods, risks related to information disclosure can be reduced so that the benefits of linked research taking place can be fully realised. © 2013 Elsevier Inc. All rights reserved.
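The abstract mentions Bloom-filter encoding of identifiers without giving details. As a rough illustration only, the sketch below shows one common way such an encoding is built: a name is split into character bigrams, each bigram is hashed with a secret key into several bit positions, and two encodings are compared with the Dice coefficient. The filter size, number of hash functions, double-hashing scheme, padding character, and all function names here are assumptions chosen for the example, not the configuration used in the paper.

```python
import hashlib
import hmac


def bigrams(value: str) -> set[str]:
    """Split a padded, lower-cased string into overlapping 2-character tokens."""
    padded = f"_{value.strip().lower()}_"  # "_" padding is an assumption
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value: str, key: bytes, size: int = 1000,
                 num_hashes: int = 20) -> set[int]:
    """Return the set bit positions of a Bloom filter for `value`.

    The filter is stored sparsely as a set of positions in [0, size).
    `size` and `num_hashes` are illustrative defaults, not values from the paper.
    """
    positions: set[int] = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        # Two keyed hashes combined by double hashing: h1 + i * h2 (mod size).
        h1 = int.from_bytes(hmac.new(key, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(key, data, hashlib.sha512).digest(), "big")
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % size)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient between two sparse Bloom filters."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    secret = b"shared-secret-key"  # hypothetical key agreed between data custodians
    smith = bloom_encode("Smith", secret)
    smyth = bloom_encode("Smyth", secret)
    jones = bloom_encode("Jones", secret)
    print("Smith vs Smyth:", round(dice(smith, smyth), 3))
    print("Smith vs Jones:", round(dice(smith, jones), 3))
```

In a probabilistic linkage setting, a similarity score of this kind would stand in for a string comparator on the unencrypted field: similar names share most bigrams and so most bit positions, while the linkage unit never sees the names themselves.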
Pages: 205-212
Number of pages: 8
Related papers
50 in total
  • [1] Privacy-preserving record linkage
    Verykios, Vassilios S.
    Christen, Peter
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (05) : 321 - 332
  • [2] Privacy-Preserving Record Linkage
    Hall, Rob
    Fienberg, Stephen E.
    [J]. PRIVACY IN STATISTICAL DATABASES, 2010, 6344 : 269 - +
  • [3] Privacy-Preserving Record Linkage with Spark
    Valkering, Onno
    Belloum, Adam
    [J]. 2019 19TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2019, : 440 - 448
  • [4] Privacy-Preserving Temporal Record Linkage
    Ranbaduge, Thilina
    Christen, Peter
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2018, : 377 - 386
  • [5] Privacy-preserving record linkage using autoencoders
    Victor Christen
    Tim Häntschel
    Peter Christen
    Erhard Rahm
    [J]. International Journal of Data Science and Analytics, 2023, 15 : 347 - 357
  • [6] A taxonomy of privacy-preserving record linkage techniques
    Vatsalan, Dinusha
    Christen, Peter
    Verykios, Vassilios S.
    [J]. INFORMATION SYSTEMS, 2013, 38 (06) : 946 - 969
  • [7] Privacy-preserving record linkage using autoencoders
    Christen, Victor
    Haentschel, Tim
    Christen, Peter
    Rahm, Erhard
    [J]. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2023, 15 (04) : 347 - 357
  • [8] Privacy-Preserving Record Linkage for Cardinality Counting
    Wu, Nan
    Vatsalan, Dinusha
    Kaafar, Mohamed Ali
    Ramesh, Sanath Kumar
    [J]. PROCEEDINGS OF THE 2023 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, ASIA CCS 2023, 2023, : 53 - 64
  • [9] Revisiting distance-based record linkage for privacy-preserving release of statistical datasets
    Herranz, Javier
    Nin, Jordi
    Rodriguez, Pablo
    Tassa, Tamir
    [J]. DATA & KNOWLEDGE ENGINEERING, 2015, 100 : 78 - 93
  • [10] Towards Privacy-Preserving Record Linkage with Record-Wise Linkage Policy
    Kaiho, Takahito
    Lu, Wen-jie
    Amagasa, Toshiyuki
    Sakuma, Jun
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2017, PT I, 2017, 10438 : 233 - 248