Protecting Confidentiality in Cancer Registry Data With Geographic Identifiers

Cited by: 7
Authors
Yu, Mandi [1]
Reiter, Jerome Phillip [2]
Zhu, Li [1]
Liu, Benmei [1]
Cronin, Kathleen A. [1]
Feuer, Eric J. [1]
Affiliations
[1] NCI, Surveillance Res Program, Div Canc Control & Populat Sci, 9609 Med Ctr Dr,Room 4E544, Rockville, MD 20850 USA
[2] Duke Univ, Dept Stat Sci, Trinity Coll Arts & Sci, Durham, NC USA
Keywords
breast cancer; classification and regression trees; health disparities; multiple imputation; partial synthetic data; Surveillance, Epidemiology, and End Results Program; BREAST-CANCER; SOCIOECONOMIC INEQUALITIES; HEALTH DISPARITIES; IMPUTATION; PRIVACY
DOI
10.1093/aje/kwx050
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Discipline classification code
1004; 120402
Abstract
The National Cancer Institute's Surveillance, Epidemiology, and End Results Program releases research files of cancer registry data. These files include geographic information at the county level, but no finer. Access to finer geography, such as census tract identifiers, would enable richer analyses, for example, examination of health disparities across neighborhoods. To date, tract identifiers have been left off the research files because they could compromise the confidentiality of patients' identities. We present an approach to inclusion of tract identifiers based on multiply imputed, synthetic data. The idea is to build a predictive model of tract locations, given patient and tumor characteristics, and randomly simulate the tract of each patient by sampling from this model. For the predictive model, we use multivariate regression trees fitted to the latitude and longitude of the population centroid of each tract. We implement the approach in the registry data from California. The method results in synthetic data that reproduce a wide range (but not all) of analyses of census tract socioeconomic cancer disparities and have relatively low disclosure risks, which we assess by comparing individual patients' actual and synthetic tract locations. We conclude with a discussion of how synthetic data sets can be used by researchers with cancer registry data.
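The synthesis idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it is a simplified, standard-library regression tree with a bivariate (latitude, longitude) response, run on made-up toy records. All function names (`grow`, `synthesize`, `match_rate`, `toy_records`), the predictors `age` and `stage`, and the tract labels are hypothetical. Drawing a synthetic tract uniformly from the tracts observed in a patient's leaf is an assumption about how to sample from the fitted tree; the abstract does not specify the sampling scheme. The match rate is likewise only one simple way to compare actual and synthetic tract locations.

```python
import random
from statistics import fmean


def sse(rows):
    """Squared deviation of (lat, lon) centroids around the node mean."""
    if not rows:
        return 0.0
    lat = fmean(r["lat"] for r in rows)
    lon = fmean(r["lon"] for r in rows)
    return sum((r["lat"] - lat) ** 2 + (r["lon"] - lon) ** 2 for r in rows)


def best_split(rows, features, min_leaf):
    """Find the (feature <= value) split with the lowest joint lat/lon error."""
    best = None
    for f in features:
        for v in sorted({r[f] for r in rows})[:-1]:
            left = [r for r in rows if r[f] <= v]
            right = [r for r in rows if r[f] > v]
            if min(len(left), len(right)) < min_leaf:
                continue
            cost = sse(left) + sse(right)
            if best is None or cost < best[0]:
                best = (cost, f, v, left, right)
    return best


def grow(rows, features, min_leaf=5, max_depth=4):
    """Grow a regression tree whose response is the pair (lat, lon)."""
    split = best_split(rows, features, min_leaf) if max_depth > 0 else None
    if split is None or split[0] >= sse(rows):
        return {"tracts": [r["tract"] for r in rows]}  # leaf keeps observed tracts
    _, f, v, left, right = split
    return {"feature": f, "value": v,
            "left": grow(left, features, min_leaf, max_depth - 1),
            "right": grow(right, features, min_leaf, max_depth - 1)}


def leaf_tracts(tree, row):
    """Follow the splits to the leaf that a patient record falls into."""
    while "tracts" not in tree:
        side = "left" if row[tree["feature"]] <= tree["value"] else "right"
        tree = tree[side]
    return tree["tracts"]


def synthesize(tree, rows, m=3, seed=0):
    """Create m synthetic tract vectors by sampling a tract from each leaf.

    Assumption: uniform draws from the leaf's observed tracts.
    """
    rng = random.Random(seed)
    return [[rng.choice(leaf_tracts(tree, r)) for r in rows] for _ in range(m)]


def match_rate(rows, synthetic):
    """Share of patients whose synthetic tract equals the actual tract."""
    return sum(r["tract"] == s for r, s in zip(rows, synthetic)) / len(rows)


def toy_records(n=40, seed=0):
    """Made-up data: younger patients in 'A' tracts, older ones in 'B' tracts."""
    rng = random.Random(seed)
    centroids = {"A1": (34.0, -118.2), "A2": (34.1, -118.3),
                 "B1": (37.7, -122.4), "B2": (37.8, -122.3)}
    rows = []
    for i in range(n):
        older = i % 2 == 1
        tract = rng.choice(["B1", "B2"] if older else ["A1", "A2"])
        lat, lon = centroids[tract]
        rows.append({"age": (65 if older else 45) + rng.randint(0, 9),
                     "stage": rng.randint(1, 4),
                     "tract": tract, "lat": lat, "lon": lon})
    return rows


if __name__ == "__main__":
    rows = toy_records()
    tree = grow(rows, ["age", "stage"])
    for k, synthetic in enumerate(synthesize(tree, rows), start=1):
        print(f"synthetic data set {k}: match rate {match_rate(rows, synthetic):.2f}")
```

In this sketch a lower match rate means fewer patients keep their true tract, at the cost of weaker agreement with tract-level analyses; the leaf size (`min_leaf`) is the knob that trades one against the other.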
Pages: 83-91
Page count: 9