Predicting stroke occurrences: a stacked machine learning approach with feature selection and data preprocessing

Cited by: 1
Authors
Chakraborty, Pritam [1]
Bandyopadhyay, Anjan [1]
Sahu, Preeti Padma [1]
Burman, Aniket [1]
Mallik, Saurav [2]
Alsubaie, Najah [3]
Abbas, Mohamed [4]
Alqahtani, Mohammed S. [5, 6]
Soufiene, Ben Othman [7]
Affiliations
[1] KIIT Univ, Sch Comp Engn, Bhubaneswar 751024, Odisha, India
[2] Harvard TH Chan Sch Publ Hlth, Dept Environm Hlth, 677 Huntington Ave, Boston, MA 02115 USA
[3] Princess Nourah Bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Comp Sci, POB 84428, Riyadh 11671, Saudi Arabia
[4] King Khalid Univ, Coll Engn, Elect Engn Dept, Abha 61421, Saudi Arabia
[5] King Khalid Univ, Coll Appl Med Sci, Radiol Sci Dept, Abha 61421, Saudi Arabia
[6] Univ Leicester, Space Res Ctr, BioImaging Unit, Michael Atiyah Bldg, Leicester LE1 7RH, England
[7] Univ Sousse, PRINCE Lab Res, ISITcom, Sousse, Tunisia
Source
BMC BIOINFORMATICS | 2024 / Volume 25 / Issue 01
Keywords
Stroke prediction; Machine learning; Principal component analysis (PCA); Stacking ensemble; Healthcare analytics; Predictive modeling; Class imbalance; Feature selection; Early intervention;
DOI
10.1186/s12859-024-05866-8
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Stroke prediction remains a critical area of research in healthcare, aiming to enhance early intervention and patient care strategies. This study investigates the efficacy of machine learning techniques, particularly principal component analysis (PCA) and a stacking ensemble method, for predicting stroke occurrences based on demographic, clinical, and lifestyle factors. We systematically varied the number of PCA components and implemented a stacking model comprising random forest, decision tree, and K-nearest neighbors (KNN) classifiers. Our findings show that setting the number of PCA components to 16 gave the best predictive performance, achieving 98.6% accuracy in stroke prediction. Evaluation metrics underscored the robustness of our approach in handling class imbalance and improving model performance. Comparative analyses against traditional machine learning algorithms such as SVM, logistic regression, and Naive Bayes highlighted the superiority of our proposed method.
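The pipeline described in the abstract (PCA with 16 components feeding a stacking ensemble of random forest, decision tree, and KNN) can be sketched with scikit-learn as shown here. This is a minimal illustration and not the authors' code: the synthetic imbalanced dataset, the standardization step, the logistic-regression meta-learner, and all hyperparameters other than the 16 PCA components are assumptions, and the helper name `build_stacked_model` is hypothetical. The accuracy it produces on synthetic data says nothing about the 98.6% reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def build_stacked_model(n_components=16, random_state=0):
    """Hypothetical helper: scaling -> PCA -> stacking ensemble."""
    # Base learners named in the abstract; hyperparameters are assumptions.
    base_learners = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        ("dt", DecisionTreeClassifier(random_state=random_state)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ]
    # The abstract does not name the meta-learner; logistic regression
    # is assumed here.
    stack = StackingClassifier(
        estimators=base_learners,
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
    # Standardizing before PCA is an assumption, not stated in the abstract.
    return make_pipeline(StandardScaler(), PCA(n_components=n_components), stack)


# Synthetic stand-in for a stroke dataset: 20 features, roughly 10% positives.
X, y = make_classification(
    n_samples=600,
    n_features=20,
    n_informative=10,
    weights=[0.9, 0.1],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = build_stacked_model().fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy on synthetic data: {accuracy:.3f}")
```

Because the classes are imbalanced, accuracy alone can look high even when the minority class is poorly predicted, so in practice per-class precision and recall would be worth inspecting alongside it.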
Pages: 23
Related papers
50 in total
  • [1] Data preprocessing and feature selection techniques in gait recognition: A comparative study of machine learning and deep learning approaches
    Parashar, Anubha
    Parashar, Apoorva
    Ding, Weiping
    Shabaz, Mohammad
    Rida, Imad
    PATTERN RECOGNITION LETTERS, 2023, 172 : 65 - 73
  • [2] Data Cleansing Meets Feature Selection: A Supervised Machine Learning Approach
    Tallon-Ballesteros, Antonio J.
    Riquelme, Jose C.
    BIOINSPIRED COMPUTATION IN ARTIFICIAL SYSTEMS, PT II, 2015, 9108 : 369 - 378
  • [3] Data Classification Using Feature Selection And kNN Machine Learning Approach
    Begum, Shemim
    Chakraborty, Debasis
    Sarkar, Ram
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 811 - 814
  • [4] Feature Selection of Photoplethysmograph Data in Machine Learning
    Haq, Faris Atoil
    Sarno, Riyanarto
    Abdillah, Rifqi
    Amri, Taufiq Choirul
    Septiyanto, Abdullah Faqih
    Sungkono, Kelly Rossa
    2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION, ICAIIC, 2023, : 315 - 320
  • [5] Impacts of Feature Selection on Predicting Machine Failures by Machine Learning Algorithms
    Bezerra, Francisco Elanio
    de Oliveira Neto, Geraldo Cardoso
    Cervi, Gabriel Magalhaes
    Mazetto, Rafaella Francesconi
    de Faria, Aline Mariane
    Vido, Marcos
    Lima, Gustavo Araujo
    de Araujo, Sidnei Alves
    Sampaio, Mauro
    Amorim, Marlene
    APPLIED SCIENCES-BASEL, 2024, 14 (08):
  • [6] A Scalable Feature Selection and Model Updating Approach for Big Data Machine Learning
    Yang, Baijian
    Zhang, Tonglin
    2016 IEEE INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD), 2016, : 146 - 151
  • [7] An efficient feature subset selection approach for machine learning
    Rincy, N. Thomas
    Gupta, Roopam
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (08) : 12737 - 12830
  • [8] An efficient feature subset selection approach for machine learning
    Thomas Rincy N
    Roopam Gupta
    Multimedia Tools and Applications, 2021, 80 : 12737 - 12830
  • [9] Quantum Optimization Approach for Feature Selection in Machine Learning
    Fleury, Gerard
    Vulpescu, Bogdan
    Lacomme, Philippe
    METAHEURISTICS, MIC 2024, PT I, 2024, 14753 : 281 - 288
  • [10] An Explainable Feature Selection Approach for Fair Machine Learning
    Yang, Zhi
    Wang, Ziming
    Huang, Changwu
    Yao, Xin
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT VIII, 2023, 14261 : 75 - 86