A scoping review of the clinical application of machine learning in data-driven population segmentation analysis

Cited by: 4
Authors
Liu, Pinyan [1]
Wang, Ziwen [1]
Liu, Nan [1, 2, 3]
Peres, Marco Aurelio
Affiliations
[1] Duke NUS Med Sch, Ctr Quantitat Med, Singapore, Singapore
[2] Duke NUS Med Sch, Programme Hlth Serv & Syst Res, Singapore, Singapore
[3] Natl Univ Singapore, Inst Data Sci, Singapore, Singapore
Keywords
population segmentation; machine learning; data analytics; population health; health services research; HEALTH DATA; SUBGROUPS; PATTERNS; CLUSTERS; CARE; PROFILES; BENEFITS;
DOI
10.1093/jamia/ocad111
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: Data-driven population segmentation is commonly used in clinical settings to divide a heterogeneous population into several relatively homogeneous groups with similar healthcare characteristics. In recent years, machine learning (ML)-based segmentation algorithms have attracted interest for their potential to speed up and improve algorithm development across many phenotypes and healthcare situations. This study evaluates ML-based segmentation with respect to (1) the populations to which it was applied, (2) the segmentation details, and (3) the outcome evaluations.
Materials and Methods: MEDLINE, Embase, Web of Science, and Scopus were searched following the PRISMA-ScR criteria. Peer-reviewed English-language studies from January 2000 to October 2022 that used data-driven population segmentation analysis on structured data were included.
Results: We identified 6077 articles and included 79 in the final analysis. Data-driven population segmentation analysis was employed in various clinical settings. K-means clustering was the most prevalent unsupervised ML paradigm. The most common settings were healthcare institutions. The most commonly targeted population was the general population.
Discussion: Although all the studies performed internal validation, only 11 papers (13.9%) performed external validation, and 23 papers (29.1%) compared methods. The existing papers paid little attention to validating the robustness of ML modeling.
Conclusion: Existing ML applications in population segmentation need further evaluation of whether they provide tailored, efficient, integrated healthcare solutions compared with traditional segmentation analysis. Future ML applications in the field should emphasize method comparison and external validation, and should investigate approaches to evaluating individual consistency across different methods.
Pages: 1573-1582
Number of pages: 10