Unlocking stroke prediction: Harnessing projection-based statistical feature extraction with ML algorithms

Authors
Sahriar, Saad [1]
Akther, Sanjida [1]
Mauya, Jannatul [1]
Amin, Ruhul [1]
Mia, Md Shahajada [2]
Ruhi, Sabba [2]
Reza, Md Shamim [1, 2]
Affiliations
[1] Pabna Univ Sci & Technol, Dept Stat, Deep Stat Learning & Res Lab, Pabna 6600, Bangladesh
[2] Pabna Univ Sci & Technol, Dept Stat, Pabna 6600, Bangladesh
Keywords
Stroke; Risk prediction; Machine learning; PCA; FA; Medical diagnosis
DOI
10.1016/j.heliyon.2024.e27411
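Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]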
Subject classification codes
07; 0710; 09
Abstract
Non-communicable diseases, such as cardiovascular disease, cancer, chronic respiratory diseases, and diabetes, are responsible for approximately 71% of all deaths worldwide. Stroke, a cerebrovascular disorder, ranks among the top three causes of death and is one of the leading contributors to this burden. Early recognition of symptoms can encourage a balanced lifestyle and provide essential information for stroke prediction. Machine learning (ML) is a key tool for helping physicians identify stroke patients and risk factors. However, ML-based algorithms struggle to detect risk factors when features are measured on different scales and carry different distributional assumptions, and they struggle further with the complexity of high-dimensional feature sets. In this study, rigorous statistical tests are used to identify risk factors, and the PCA-FA (Integration of Principal Components and Factors) and FPCA (Factor Based PCA) approaches are proposed for projecting suitable feature representations that improve the performance of learning algorithms. The study dataset contains 5110 patient records with clinical, lifestyle, and genetic attributes, allowing a comprehensive analysis of potential risk factors associated with stroke. Using significance tests (p-value < 0.05), the chi-square test and the independent-samples t-test identified age, heart_disease, hypertension, work_type, ever_married, bmi, and smoking_status as risk factors for stroke. Among the predictive models built with the proposed feature extraction techniques, the random forest approach gives the best results when used with the PCA-FA method, reaching an accuracy of 92.55% and an AUC score of 98.15%. This is an improvement in prediction accuracy of between 2.19% and 19.03% over existing work. Additionally, the prediction results are made more robust and reproducible with a stacking ensemble-based classification algorithm. We also developed a web-based application, based on the findings of this study, that doctors could use as an additional tool when diagnosing stroke risk.
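The chi-square screening step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary risk factor cross-tabulated against stroke status in a 2x2 table; the function name `chi_square_2x2` and the counts are hypothetical and are not taken from the study, whose exact test configuration is not given in this record.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 contingency table.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square upper tail is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Made-up counts for illustration only (not from the study):
# rows = hypertension yes / no, columns = stroke yes / no.
example_table = [[30, 70], [20, 180]]
statistic, p_value = chi_square_2x2(example_table)
# The abstract's decision rule: keep the attribute if p-value < 0.05.
is_risk_factor = p_value < 0.05
```

Attributes with more than two categories, such as work_type or smoking_status, would need a larger table and more degrees of freedom, for which a statistics library is the more practical choice.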
Pages: 15