Improving Consensus Scoring of Crowdsourced Data Using the Rasch Model: Development and Refinement of a Diagnostic Instrument

Cited by: 10
Authors
Brady, Christopher John [1]
Mudie, Lucy Iluka [1]
Wang, Xueyang [1]
Guallar, Eliseo [2]
Friedman, David Steven [1,2]
Affiliations
[1] Johns Hopkins Univ, Sch Med, Wilmer Eye Inst, Dana Ctr Prevent Ophthalmol, 600 N Wolfe St, Baltimore, MD 21205 USA
[2] Johns Hopkins Univ, Dept Epidemiol, Bloomberg Sch Publ Hlth, Baltimore, MD USA
Funding
National Institutes of Health (USA);
Keywords
crowdsourcing; diabetic retinopathy; Rasch analysis; Amazon Mechanical Turk; DIABETIC-RETINOPATHY; TELEMEDICINE; RISK; MELLITUS; IMAGES;
DOI
10.2196/jmir.7984
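Chinese Library Classification number
R19 [Health care organizations and services (health services management)];
Discipline classification code
Abstract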
Background: Diabetic retinopathy (DR) is a leading cause of vision loss in working-age individuals worldwide. While screening is effective and cost-effective, it remains underutilized, and novel methods are needed to increase detection of DR. This clinical validation study compared diagnostic gradings of retinal fundus photographs provided by volunteers on the Amazon Mechanical Turk (AMT) crowdsourcing marketplace with expert-provided gold-standard grading and explored whether determination of the consensus of crowdsourced classifications could be improved beyond a simple majority vote (MV) using regression methods.
Objective: The aim of our study was to determine whether regression methods could be used to improve the consensus grading of data collected by crowdsourcing.
Methods: A total of 1200 retinal images of individuals with diabetes mellitus from the Messidor public dataset were posted to AMT. Eligible crowdsourcing workers had at least 500 previously approved tasks with an approval rating of 99% across their prior submitted work. A total of 10 workers were recruited to classify each image as normal or abnormal. If half or more of the workers judged the image to be abnormal, the MV consensus grade was recorded as abnormal. Rasch analysis was then used to calculate worker ability scores in a random 50% training set, and these scores were used as weights in a regression model in the remaining 50% test set to determine whether a more accurate consensus could be devised. Outcomes of interest were the percentage of correctly classified images, sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) for the consensus grade as compared with the expert grading provided with the dataset.
Results: Using MV grading, the consensus was correct in 75.5% of images (906/1200), with 75.5% sensitivity, 75.5% specificity, and an AUROC of 0.75 (95% CI 0.73-0.78). A logistic regression model using Rasch-weighted individual scores generated an AUROC of 0.91 (95% CI 0.88-0.93) compared with 0.89 (95% CI 0.86-0.92) for a model using unweighted scores (chi-square P value<.001). Setting a diagnostic cut-point to optimize sensitivity at 90%, 77.5% (465/600) of images were graded correctly, with 90.3% sensitivity, 68.5% specificity, and an AUROC of 0.79 (95% CI 0.76-0.83).
Conclusions: Crowdsourced interpretations of retinal images provide rapid and accurate results as compared with a gold-standard grading. Creating a logistic regression model using Rasch analysis to weight crowdsourced classifications by worker ability improves accuracy of aggregated grades as compared with simple majority vote.
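The consensus rules described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the weighted score is a simple ability-weighted fraction of abnormal votes standing in for the Rasch-weighted logistic regression reported in the abstract, and the weights are assumed to be positive numbers (Rasch abilities in logits would need rescaling first).

```python
from typing import Sequence, Tuple


def majority_vote(votes: Sequence[int]) -> int:
    """Return 1 (abnormal) if half or more of the votes are 1, else 0.

    Ties count as abnormal, matching the rule stated in the abstract.
    """
    return 1 if 2 * sum(votes) >= len(votes) else 0


def weighted_score(votes: Sequence[int], weights: Sequence[float]) -> float:
    """Ability-weighted fraction of abnormal votes, in [0, 1].

    `weights` is a hypothetical stand-in for per-worker ability scores;
    all weights are assumed to be positive.
    """
    total = sum(weights)
    return sum(w * v for v, w in zip(votes, weights)) / total


def sensitivity_specificity(
    predicted: Sequence[int], truth: Sequence[int]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) of predictions against a gold standard."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up votes from 10 workers on one image (1 = abnormal, 0 = normal).
    votes = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    print(majority_vote(votes))  # a 5-5 tie is graded abnormal

    # Made-up weights; lowering the cut-point trades specificity for sensitivity.
    weights = [2.0, 1.5, 1.0, 1.0, 0.5, 0.5, 0.5, 1.0, 1.0, 1.0]
    cut_point = 0.5
    print(int(weighted_score(votes, weights) >= cut_point))
```

Moving `cut_point` below 0.5 corresponds loosely to the abstract's choice of a diagnostic cut-point that favors sensitivity over specificity.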
Pages: 13