A cluster-based ensemble approach for congenital heart disease prediction

Cited by: 6
Authors
Kaur, Ishleen [1]
Ahmad, Tanvir [2]
Affiliations
[1] Univ Delhi, Sri Guru Tegh Bahadur Khalsa Coll, Delhi, India
[2] Jamia Millia Islamia, Dept Comp Engn, New Delhi, India
Keywords
Congenital heart disease; DBSCAN; Ensemble; Machine learning; Random forest; DIAGNOSIS; DEFECTS; TRENDS
DOI
10.1016/j.cmpb.2023.107922
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Background: Congenital heart disease (CHD) is one of the most prevalent birth disorders. Although CHD risk factors have been the subject of numerous studies, their propensity to cause CHD has not been tested. In particular, few studies have attempted to forecast CHD risk using population-based cross-sectional data, which are inherently imbalanced. Objective: The main goals of this study are to create a reliable data analysis model that can help with (i) a better understanding of congenital heart disease prediction in the presence of missing and unbalanced data and (ii) creating cohorts of expectant mothers with similar lifestyle characteristics. Methods: Clusters of patient cohorts were produced using the unsupervised data mining technique density-based spatial clustering of applications with noise (DBSCAN). For more accurate CHD prediction, a random forest model was trained using these clusters and their corresponding patterns. The study used a dataset of 33,831 expectant mothers to make its predictions. Missing data were handled using the k-NN imputation approach, while the extremely unbalanced data were balanced using SMOTE. These techniques are all data-driven and need little to no user or expert involvement. Results and Conclusion: Using DBSCAN, three cohorts were found. The cluster information enhanced the random forest-based CHD prediction and revealed intricate factors that influence prediction accuracy. The proposed approach achieved the best results, with 99% accuracy and an AUC of 0.91, and performed better than state-of-the-art methodologies. Hence, the suggested method using unsupervised learning can provide intricate information to the classifier and further enhance classification performance.
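The clustering and balancing steps named in the Methods can be illustrated with a small standard-library sketch. This is not the authors' implementation: the toy points, the `eps` and `min_pts` values, and the function names `region_query`, `dbscan`, and `smote_like` are all hypothetical, and the interpolation routine is a simplified stand-in for SMOTE. The k-NN imputation and random forest steps are not shown.

```python
import math
import random

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may become a border point later
            continue
        cluster += 1                     # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also core, keep expanding
    return labels

def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling: interpolate between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = sorted((p for p in minority if p is not base),
                        key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(others)
        t = rng.random()                 # position along the connecting segment
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Toy data (hypothetical): two dense groups and one isolated point.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),
          (9.0, 0.0)]

labels = dbscan(points, eps=0.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]

# Append the cluster id as an extra feature for a downstream classifier.
augmented = [p + (c,) for p, c in zip(points, labels)]

# Oversample one group as if it were the minority class.
minority = points[4:8]
synthetic = smote_like(minority, n_new=5)
```

In practice one would more likely rely on library implementations, for example `DBSCAN`, `KNNImputer`, and `RandomForestClassifier` from scikit-learn and `SMOTE` from imbalanced-learn, rather than hand-written versions like these.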
Pages: 7