A cluster-based ensemble approach for congenital heart disease prediction

Cited by: 6
Authors
Kaur, Ishleen [1]
Ahmad, Tanvir [2]
Affiliations
[1] Univ Delhi, Sri Guru Tegh Bahadur Khalsa Coll, Delhi, India
[2] Jamia Millia Islamia, Dept Comp Engn, New Delhi, India
Keywords
Congenital heart disease; DBSCAN; Ensemble; Machine learning; Random forest; DIAGNOSIS; DEFECTS; TRENDS
DOI
10.1016/j.cmpb.2023.107922
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Background: Congenital heart disease (CHD) is one of the most prevalent birth disorders. Although CHD risk factors have been the subject of numerous studies, their propensity to cause CHD has not been tested. In particular, few studies have attempted to forecast CHD risk using population-based cross-sectional data, which are inherently imbalanced.
Objective: This study aims to build a reliable data analysis model that supports (i) a better understanding of CHD prediction in the presence of missing and imbalanced data and (ii) the creation of cohorts of expectant mothers with similar lifestyle characteristics.
Methods: Patient cohorts are formed with density-based spatial clustering of applications with noise (DBSCAN), an unsupervised data mining technique. For more accurate CHD prediction, a random forest model is trained using these clusters and their corresponding patterns. The study uses a dataset of 33,831 expectant mothers. Missing data are handled with k-NN imputation, and the extreme class imbalance is corrected with SMOTE. All of these techniques are data-driven and need little to no user or expert involvement.
Results and Conclusion: DBSCAN identified three cohorts. The cluster information improved the random forest-based CHD prediction and revealed intricate factors that influence prediction accuracy. The proposed approach achieved the best results, with 99% accuracy and 0.91 AUC, and outperformed state-of-the-art methods. Hence, the proposed use of unsupervised learning can supply intricate information to the classifier and further improve classification performance.
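The pipeline named in the abstract (k-NN imputation, DBSCAN cohorts, SMOTE balancing, random forest) can be sketched roughly as below. This is a minimal illustration under several assumptions, not the authors' implementation: the data are synthetic, the order of the steps and all hyperparameters (`eps`, `min_samples`, `n_neighbors`, `n_estimators`) are guesses, `smote` is a simplified hand-written stand-in for a full SMOTE implementation, and `assign_clusters` is a hypothetical helper, needed because scikit-learn's `DBSCAN` cannot label unseen points on its own.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import KNNImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the cohort data: ~5% positive class, ~5% missing cells.
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           n_redundant=0, weights=[0.95, 0.05], random_state=0)
X[rng.random(X.shape) < 0.05] = np.nan
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# 1. k-NN imputation, fitted on the training split only.
imputer = KNNImputer(n_neighbors=5)
X_train = imputer.fit_transform(X_train)
X_test = imputer.transform(X_test)

# 2. DBSCAN on standardized training features (eps and min_samples are guesses).
scaler = StandardScaler().fit(X_train)
db = DBSCAN(eps=1.5, min_samples=10).fit(scaler.transform(X_train))

def assign_clusters(db, X_scaled):
    """Label points by their nearest core sample; -1 (noise) if none lies within eps."""
    if len(db.components_) == 0:
        return np.full(len(X_scaled), -1)
    nn = NearestNeighbors(n_neighbors=1).fit(db.components_)
    dist, idx = nn.kneighbors(X_scaled)
    labels = db.labels_[db.core_sample_indices_][idx[:, 0]]
    labels[dist[:, 0] > db.eps] = -1
    return labels

def smote(X_min, n_new, k=5):
    """Simplified SMOTE: interpolate between a minority point and one of its k neighbours."""
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X_min).kneighbors(X_min)
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = idx[base, rng.integers(1, k + 1, size=n_new)]  # column 0 is the point itself
    return X_min[base] + rng.random((n_new, 1)) * (X_min[neigh] - X_min[base])

# 3. Oversample the minority class in the training split until the classes match.
n_new = int((y_train == 0).sum() - (y_train == 1).sum())
X_bal = np.vstack([X_train, smote(X_train[y_train == 1], n_new)])
y_bal = np.concatenate([y_train, np.ones(n_new, dtype=int)])

# 4. Append the cluster label as an extra feature and train the random forest.
def with_cluster(X_raw):
    return np.column_stack([X_raw, assign_clusters(db, scaler.transform(X_raw))])

X_train_aug, X_test_aug = with_cluster(X_bal), with_cluster(X_test)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train_aug, y_bal)
auc = roc_auc_score(y_test, rf.predict_proba(X_test_aug)[:, 1])
print(f"clusters found: {len(set(db.labels_) - {-1})}, test AUC: {auc:.3f}")
```

Oversampling is applied only to the training split so that the test AUC is measured on the original class distribution. The cluster label is computed after oversampling so that synthetic rows receive a valid cohort assignment and never an interpolated one.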
Pages: 7