Responsible Data Sharing: Identifying and Remedying Possible Re-Identification of Human Participants

Cited by: 2
Authors
Morehouse, Kirsten N. [1]
Kurdi, Benedek [2]
Nosek, Brian A. [3, 4]
Affiliations
[1] Harvard Univ, Dept Psychol, 33 Kirkland St, Cambridge, MA 02138 USA
[2] Univ Illinois, Dept Psychol, Champaign, IL USA
[3] Univ Virginia, Dept Psychol, Charlottesville, VA USA
[4] Ctr Open Sci, Charlottesville, VA USA
Keywords
privacy; open science; data anonymity; re-identification; research integrity; de-identification; k-anonymity
DOI
10.1037/amp0001346
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
Open data collected from research participants creates a tension between scholarly values of transparency and sharing, on the one hand, and privacy and security, on the other hand. A common solution is to make data sets anonymous by removing personally identifying information (e.g., names or worker IDs) before sharing. However, ostensibly anonymized data sets may be at risk of re-identification if they include demographic information. In the present article, we provide researchers with broadly applicable guidance and tangible tools so that they can engage in open science practices without jeopardizing participants' privacy. Specifically, we (a) review current privacy standards, (b) describe computer science data protection frameworks and their adaptability to the social sciences, (c) provide practical guidance for assessing and addressing re-identification risk, (d) introduce two open-source algorithms developed for psychological scientists, MinBlur and MinBlurLite, to increase privacy while maintaining the integrity of open data, and (e) highlight aspects of ethical data sharing that require further attention. Ultimately, the risk of re-identification should not dissuade engagement with open science practices. Instead, technical innovations should be developed and harnessed so that science can be as open as possible to promote transparency and sharing and as closed as necessary to maintain privacy and security.
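To make the re-identification concern concrete, the following is a minimal sketch of a k-anonymity check, one of the keywords listed for this article. It is not MinBlur or MinBlurLite, whose internals are not described in this record. The function names, the column names (`age`, `gender`, `zip`), and the participant rows are all hypothetical and chosen only for illustration. A data set is k-anonymous with respect to a set of demographic columns if every combination of values in those columns is shared by at least k rows.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the
    same values on all quasi-identifier columns (0 for empty data)."""
    if not rows:
        return 0
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


def coarsen_age(rows, width=10):
    """Generic generalization step (an assumption, not the article's method):
    replace each exact age with the lower bound of its age bin."""
    return [{**row, "age": (row["age"] // width) * width} for row in rows]


# Hypothetical participant records with names and worker IDs already removed.
participants = [
    {"age": 34, "gender": "F", "zip": "02138"},
    {"age": 36, "gender": "F", "zip": "02138"},
    {"age": 35, "gender": "M", "zip": "22903"},
    {"age": 38, "gender": "M", "zip": "22903"},
]

columns = ["age", "gender", "zip"]
k_raw = k_anonymity(participants, columns)
k_binned = k_anonymity(coarsen_age(participants), columns)
print(k_raw, k_binned)
```

In this toy data every raw row is unique on the three columns (k = 1), so each participant could in principle be singled out; binning age into decades raises k to 2 at the cost of less precise age information. That trade-off between privacy and data integrity is the one the abstract describes.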
Pages: 15