Recommendations for the development and use of imaging test sets to investigate the test performance of artificial intelligence in health screening

Cited by: 8
Authors
Chalkidou A. [1]
Shokraneh F. [1]
Kijauskaite G. [2]
Taylor-Phillips S. [3]
Halligan S. [4]
Wilkinson L. [5]
Glocker B. [6]
Garrett P. [7]
Denniston A.K. [8]
Mackie A. [2]
Seedat F. [2]
Affiliations
[1] King's Technology Evaluation Centre, King's College London, London
[2] UK National Screening Committee, Office for Health Improvement and Disparities, Department of Health and Social Care, London
[3] Warwick Medical School, University of Warwick, Coventry
[4] Centre for Medical Imaging, Division of Medicine, University College London, London
[5] Oxford Breast Imaging Centre, Oxford University, Oxford
[6] Department of Computing, Imperial College London, London
[7] Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester
[8] Department of Ophthalmology, University Hospitals Birmingham NHS Foundation Trust, Birmingham
Source
The Lancet Digital Health | 2022 / Volume 4 / Issue 12
DOI
10.1016/S2589-7500(22)00186-8
Abstract
Rigorous evaluation of artificial intelligence (AI) systems for image classification is essential before deployment into health-care settings, such as screening programmes, so that adoption is effective and safe. A key step in the evaluation process is the external validation of diagnostic performance using a test set of images. We conducted a rapid review of English-language literature, published from 2012 to 2020, on methods for developing test sets. Using thematic analysis, we mapped themes and coded the principles using the Population, Intervention, Comparator or Reference standard, Outcome, and Study design framework. A group of screening and AI experts assessed the evidence-based principles for completeness and provided further considerations. Of the final 15 principles recommended here, five affect population, one intervention, two comparator, one reference standard, and one both reference standard and comparator. Finally, four are applicable to outcome and one to study design. Principles from the literature were useful for addressing biases from AI; however, they did not account for screening-specific biases, which we now incorporate. The principles set out here should be used to support the development and use of test sets for studies that assess the accuracy of AI within screening programmes, to ensure that the test sets are fit for purpose and minimise bias. © 2022 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
Pages: e899 - e905
Page count: 6