Systematic review identifies the design and methodological conduct of studies on machine learning-based prediction models

Cited by: 39
Authors
Navarro, Constanza L. Andaur [1, 2, 6]
Damen, Johanna A. A. [1, 2]
van Smeden, Maarten [1]
Takada, Toshihiko [1]
Nijman, Steven W. J. [1]
Dhiman, Paula [3, 4]
Ma, Jie [3]
Collins, Gary S. [3, 4]
Bajpai, Ram [5]
Riley, Richard D. [5]
Moons, Karel G. M. [1, 2]
Hooft, Lotty [1, 2]
Affiliations
[1] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[2] Univ Utrecht, Univ Med Ctr Utrecht, Cochrane Netherlands, Utrecht, Netherlands
[3] Univ Oxford, Ctr Stat Med, Nuffield Dept Orthopaed Rheumatol & Musculoskeleta, Oxford, England
[4] Oxford Univ Hosp NHS Fdn Trust, NIHR Oxford Biomed Res Ctr, Oxford, England
[5] Keele Univ, Ctr Prognosis Res, Sch Med, Keele, England
[6] Julius Ctr Hlth Sci & Primary Care, Univ Weg 100,POB 8550, NL-3508 GA Utrecht, Netherlands
Funding
Australian Research Council;
Keywords
Predictive algorithm; Risk prediction; Diagnosis; Prognosis; Development; Validation; RISK; APPLICABILITY; EXPLANATION; VALIDATION; DIAGNOSIS; PROBAST; BIAS; TOOL;
DOI
10.1016/j.jclinepi.2022.11.015
Chinese Library Classification number
R19 [Health care organization and services (health services administration)];
Abstract
Background and Objectives: We sought to summarize the study design, modelling strategies, and performance measures reported in studies on clinical prediction models developed using machine learning techniques.
Methods: We searched PubMed for articles published between 01/01/2018 and 31/12/2019 that described the development, or the development with external validation, of a multivariable prediction model using any supervised machine learning technique. No restrictions were made based on study design, data source, or predicted patient-related health outcomes.
Results: We included 152 studies: 58 (38.2% [95% CI 30.8-46.1]) were diagnostic and 94 (61.8% [95% CI 53.9-69.2]) were prognostic. Most studies reported only the development of prediction models (n = 133, 87.5% [95% CI 81.3-91.8]), focused on binary outcomes (n = 131, 86.2% [95% CI 79.8-90.8]), and did not report a sample size calculation (n = 125, 82.2% [95% CI 75.4-87.5]). The most common algorithms used were support vector machine (n = 86/522, 16.5% [95% CI 13.5-19.9]) and random forest (n = 73/522, 14% [95% CI 11.3-17.2]). Values for the area under the Receiver Operating Characteristic curve ranged from 0.45 to 1.00. Calibration metrics were often missing (n = 494/522, 94.6% [95% CI 92.4-96.3]).
Conclusion: Our review revealed that focus is required on the handling of missing values, methods for internal validation, and reporting of calibration to improve the methodological conduct of studies on machine learning-based prediction models.
Systematic review registration: PROSPERO, CRD42019161764.
© 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
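The abstract reports each proportion with a 95% confidence interval but does not say how the intervals were computed. As an illustration only, the sketch below uses a Wilson score interval, which is an assumption on our part; the function name `wilson_interval` and the fixed z-value of 1.96 are hypothetical choices, not taken from the paper. Applied to the 58 diagnostic studies out of 152, it gives the same rounded figures as the abstract.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumption: z = 1.96 approximates a two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Diagnostic studies in the abstract: 58 of 152 included studies.
low, high = wilson_interval(58, 152)
print(f"{58 / 152:.1%} [95% CI {low * 100:.1f}-{high * 100:.1f}]")
# prints: 38.2% [95% CI 30.8-46.1]
```

Agreement on this one figure does not confirm which method the authors used; other interval methods can give similar values at this sample size.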
Pages: 8-22
Number of pages: 15