Use of name recognition software, census data and multiple imputation to predict missing data on ethnicity: application to cancer registry records

Cited by: 34
Authors
Ryan, Ronan [1]
Vernon, Sally [2]
Lawrence, Gill [1]
Wilson, Sue
Affiliations
[1] University of Birmingham, West Midlands Cancer Intelligence Unit, Birmingham B15 2TT, West Midlands, England
[2] Eastern Cancer Registration and Information Centre, Cambridge CB22 3AD, England
Keywords
VALIDATION; BREAST
DOI
10.1186/1472-6947-12-3
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Information on ethnicity is commonly used by health services and researchers to plan services, ensure equality of access, and conduct epidemiological studies. In common with other important demographic and clinical data, it is often incompletely recorded. This paper presents a method for imputing missing data on the ethnicity of cancer patients, developed for a regional cancer registry in the UK.

Methods: Routine records from cancer screening services, name recognition software (Nam Pehchan and Onomap), 2001 national Census data, and multiple imputation were used to predict the ethnicity of the 23% of cases for which it was still missing following linkage with self-reported ethnicity from inpatient hospital records.

Results: The name recognition software packages were good predictors of ethnicity for South Asian cancer cases when compared with data on ethnicity derived from hospital inpatient records, especially when combined (sensitivity 90.5%; specificity 99.9%; PPV 93.3%). Onomap was a poor predictor of ethnicity for other minority ethnic groups (sensitivity 4.4% for Black cases and 0.0% for Chinese/Other ethnic groups). Area-based data derived from the national Census were also a poor predictor of non-White ethnicity (sensitivity: South Asian 7.4%; Black 2.3%; Chinese/Other 0.0%; Mixed 0.0%).

Conclusions: Currently, neither method for assigning individuals to an ethnic group (name recognition or ethnic distribution of area of residence) performs well across all ethnic groups. We recommend further development of name recognition applications and the identification of additional methods for predicting ethnicity to improve their precision and accuracy for comparisons of health outcomes. However, real improvements can only come from better recording of ethnicity by health services.
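The abstract reports sensitivity, specificity and positive predictive value (PPV) for each predictor against ethnicity recorded in hospital inpatient records. The following is a minimal sketch of how such one-vs-rest measures can be computed for a single ethnic group. The function name `evaluate`, the `Metrics` class and the toy labels are hypothetical and are not taken from the paper; the numbers are illustrative only and do not reproduce the study's results.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # TP / (TP + FP)


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def evaluate(predicted: list[str], reference: list[str], group: str) -> Metrics:
    """Score predictions for one group against reference labels (one-vs-rest)."""
    tp = fp = fn = tn = 0
    for pred, ref in zip(predicted, reference):
        if ref == group and pred == group:
            tp += 1
        elif ref == group:
            fn += 1
        elif pred == group:
            fp += 1
        else:
            tn += 1
    return Metrics(
        sensitivity=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=_ratio(tp, tp + fp),
    )


# Hypothetical toy data: reference = recorded ethnicity, predicted = software output.
reference = ["South Asian", "White", "White", "Black", "South Asian", "White"]
predicted = ["South Asian", "White", "White", "White", "White", "White"]

m = evaluate(predicted, reference, "South Asian")
print(f"sensitivity={m.sensitivity:.2f} specificity={m.specificity:.2f} ppv={m.ppv:.2f}")
```

A predictor can have very high specificity and still miss most cases in a small group, which is the pattern the abstract describes for the Black and Chinese/Other groups.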
Pages: 8
Published in: BMC Medical Informatics and Decision Making, 12