Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing

Cited by: 1
Authors
Chakraborty, Pritam [1]
Bandyopadhyay, Anjan [1]
Sahu, Preeti Padma [1]
Burman, Aniket [1]
Mallik, Saurav [2]
Alsubaie, Najah [3]
Abbas, Mohamed [4]
Alqahtani, Mohammed S. [5, 6]
Soufiene, Ben Othman [7]
Affiliations
[1] KIIT Univ, Sch Comp Engn, Bhubaneswar 751024, Odisha, India
[2] Harvard TH Chan Sch Publ Hlth, Dept Environm Hlth, 677 Huntington Ave, Boston, MA 02115 USA
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia
[4] King Khalid Univ, Coll Engn, Elect Engn Dept, Abha 61421, Saudi Arabia
[5] King Khalid Univ, Coll Appl Med Sci, Radiol Sci Dept, Abha 61421, Saudi Arabia
[6] Univ Leicester, Space Res Ctr, BioImaging Unit, Michael Atiyah Bldg, Leicester LE1 7RH, England
[7] Univ Sousse, PRINCE Lab Res, ISITcom, Sousse, Tunisia
Source
BMC BIOINFORMATICS | 2024 / Vol. 25 / Issue 01
Keywords
Stroke prediction; Machine learning; Principal component analysis (PCA); Stacking ensemble; Healthcare analytics; Predictive modeling; Class imbalance; Feature selection; Early intervention
DOI
10.1186/s12859-024-05866-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Stroke prediction remains a critical area of research in healthcare, with the aim of enhancing early intervention and patient care strategies. This study investigates the efficacy of machine learning techniques, particularly principal component analysis (PCA) and a stacking ensemble method, for predicting stroke occurrences based on demographic, clinical, and lifestyle factors. We systematically varied the number of PCA components and implemented a stacking model comprising random forest, decision tree, and K-nearest neighbors (KNN) classifiers. Our findings demonstrate that setting the number of PCA components to 16 gave the best predictive performance, achieving 98.6% accuracy in stroke prediction. Evaluation metrics underscored the robustness of our approach in handling class imbalance and improving model performance. Comparative analyses against traditional machine learning algorithms such as SVM, logistic regression, and Naive Bayes highlighted the superiority of our proposed method.
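The pipeline described in the abstract (PCA reduced to 16 components, followed by a stacking ensemble of random forest, decision tree, and KNN) might be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' implementation: the synthetic data from make_classification, the standardization step, the logistic-regression meta-learner, and all hyperparameters are assumptions, since the abstract does not specify them. The accuracy it prints has no relation to the 98.6% reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: NOT the study's dataset).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=10,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners named in the abstract; hyperparameters are assumptions.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# The meta-learner is not stated in the abstract; logistic regression is assumed.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Scale, project onto 16 principal components, then fit the stacked model.
model = make_pipeline(StandardScaler(), PCA(n_components=16), stack)
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"held-out accuracy on synthetic data: {test_accuracy:.3f}")
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training split only, which avoids leaking test-set statistics into the projection. With imbalanced classes, accuracy alone can be misleading, so metrics such as recall or F1 on the minority class would normally be inspected as well.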
Pages: 23
Related papers
50 in total
  • [21] Predicting cardiovascular disease by combining optimal feature selection methods with machine learning
    Rodriguez Segura, Mauricio
    Nicolis, Orietta
    Peralta Marquez, Billy
    Carrillo Azocar, Juan
    2020 39TH INTERNATIONAL CONFERENCE OF THE CHILEAN COMPUTER SCIENCE SOCIETY (SCCC), 2020
  • [22] Machine Learning Feature Selection for Predicting High Concentration Therapeutic Antibody Aggregation
    Lai, Pin-Kuang
    Fernando, Amendra
    Cloutier, Theresa K.
    Kingsbury, Jonathan S.
    Gokarn, Yatin
    Halloran, Kevin T.
    Calero-Rubio, Cesar
    Trout, Bernhardt L.
    JOURNAL OF PHARMACEUTICAL SCIENCES, 2021, 110 (04) : 1583 - 1591
  • [23] A photovoltaic power prediction approach enhanced by feature engineering and stacked machine learning model
    Abdelmoula, Ibtihal Ait
    Elhamaoui, Said
    Elalani, Omaima
    Ghennioui, Abdellatif
    El Aroussi, Mohamed
    ENERGY REPORTS, 2022, 8 : 1288 - 1300
  • [24] Stacked Machine Learning Model for Predicting Alzheimer's Disease Based on Genetic Data
    Alatrany, Abbas Saad
    Hussain, Abir
    Jamila, Mustafina
    Al-Jumeily, Dhiya
    2021 14TH INTERNATIONAL CONFERENCE ON DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE), 2021, : 594 - 598
  • [25] Stacked machine learning approach for predicting evolved hydrogen from sugar industry wastewater
    Bakir, Rezan
    Orak, Ceren
    INTERNATIONAL JOURNAL OF HYDROGEN ENERGY, 2024, 85 : 75 - 87
  • [26] A Machine Learning Approach to Mass Spectra Classification with Unsupervised Feature Selection
    Ceccarelli, Michele
    d'Acierno, Antonio
    Facchiano, Angelo
    COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS, 2009, 5488 : 242 - +
  • [27] A machine learning approach for predicting suicidal ideation in post stroke patients
    Song, Seung Il
    Hong, Hyeon Taek
    Lee, Changwoo
    Lee, Seung Bo
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [29] Scalable Machine Learning with Granulated Data Summaries: A Case of Feature Selection
    Chadzynska-Krasowska, Agnieszka
    Betlinski, Pawel
    Slezak, Dominik
    FOUNDATIONS OF INTELLIGENT SYSTEMS, ISMIS 2017, 2017, 10352 : 519 - 529
  • [30] FEATURE SELECTION FOR KICK DETECTION WITH MACHINE LEARNING USING LABORATORY DATA
    Geekiyanage, Suranga C. H.
    Ambrus, Adrian
    Sui, Dan
    PROCEEDINGS OF THE ASME 38TH INTERNATIONAL CONFERENCE ON OCEAN, OFFSHORE AND ARCTIC ENGINEERING, 2019, VOL 8, 2019