A scoping review of the clinical application of machine learning in data-driven population segmentation analysis

Cited by: 4
Authors
Liu, Pinyan [1]
Wang, Ziwen [1]
Liu, Nan [1,2,3]
Peres, Marco Aurelio
Affiliations
[1] Duke NUS Med Sch, Ctr Quantitat Med, Singapore, Singapore
[2] Duke NUS Med Sch, Programme Hlth Serv & Syst Res, Singapore, Singapore
[3] Natl Univ Singapore, Inst Data Sci, Singapore, Singapore
Keywords
population segmentation; machine learning; data analytics; population health; health services research; HEALTH DATA; SUBGROUPS; PATTERNS; CLUSTERS; CARE; PROFILES; BENEFITS;
DOI
10.1093/jamia/ocad111
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Data-driven population segmentation is commonly used in clinical settings to separate a heterogeneous population into multiple relatively homogeneous groups with similar healthcare features. In recent years, machine learning (ML)-based segmentation algorithms have garnered interest for their potential to speed up and improve algorithm development across many phenotypes and healthcare situations. This study evaluates ML-based segmentation with respect to (1) the populations to which it was applied, (2) the segmentation details, and (3) the outcome evaluations.
Materials and Methods: MEDLINE, Embase, Web of Science, and Scopus were searched following the PRISMA-ScR criteria. Peer-reviewed studies in English, from January 2000 to October 2022, that used data-driven population segmentation analysis on structured data were included.
Results: We identified 6077 articles and included 79 in the final analysis. Data-driven population segmentation analysis was employed in various clinical settings. K-means clustering was the most prevalent unsupervised ML paradigm. The most common settings were healthcare institutions. The most commonly targeted population was the general population.
Discussion: Although all the studies performed internal validation, only 11 papers (13.9%) performed external validation, and 23 papers (29.1%) conducted a methods comparison. The existing papers gave little attention to validating the robustness of ML modeling.
Conclusion: Existing ML applications in population segmentation need more evaluation of whether they provide tailored, efficient, integrated healthcare solutions compared to traditional segmentation analysis. Future ML applications in the field should emphasize method comparisons and external validation, and investigate approaches to evaluate individual consistency using different methods.
Pages: 1573-1582
Page count: 10