Unlocking stroke prediction: Harnessing projection-based statistical feature extraction with ML algorithms

Authors
Sahriar, Saad [1]
Akther, Sanjida [1]
Mauya, Jannatul [1]
Amin, Ruhul [1]
Mia, Md Shahajada [2]
Ruhi, Sabba [2]
Reza, Md Shamim [1, 2]
Affiliations
[1] Pabna Univ Sci & Technol, Dept Stat, Deep Stat Learning & Res Lab, Pabna 6600, Bangladesh
[2] Pabna Univ Sci & Technol, Dept Stat, Pabna 6600, Bangladesh
Keywords
Stroke; Risk prediction; Machine learning; PCA; FA; Medical diagnosis
DOI
10.1016/j.heliyon.2024.e27411
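Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]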
Discipline classification codes
07; 0710; 09
Abstract
Non-communicable diseases, such as cardiovascular disease, cancer, chronic respiratory diseases, and diabetes, are responsible for approximately 71% of all deaths worldwide. Stroke, a cerebrovascular disorder, is one of the leading contributors to this burden and ranks among the top three causes of death. Early recognition of symptoms can encourage a balanced lifestyle and provide essential information for stroke prediction. Machine learning (ML) is a key tool that helps physicians identify stroke patients and risk factors. Because features are measured on different scales and follow different probability distributions, ML algorithms struggle to detect risk factors. Furthermore, learning algorithms struggle with complexity when the risk factors are high-dimensional. In this study, rigorous statistical tests are used to identify risk factors, and the PCA-FA (Integration of Principal Components and Factors) and FPCA (Factor Based PCA) approaches are proposed to project the data onto suitable feature representations that improve the performance of learning algorithms. The study dataset contains 5110 patient records with clinical, lifestyle, and genetic attributes, allowing for a comprehensive analysis of potential risk factors associated with stroke. At a significance level of 0.05 (p-value < 0.05), the chi-square test and the independent-samples t-test identified age, heart_disease, hypertension, work_type, ever_married, bmi, and smoking_status as risk factors for stroke. Among the prediction models built on the proposed feature extraction techniques, the random forest classifier provides the best results when combined with the PCA-FA method, reaching an accuracy of 92.55% and an AUC score of 98.15%. Prediction accuracy improves by between 2.19% and 19.03% compared with existing work. Additionally, the prediction results are robust and reproducible with a stacking ensemble-based classification algorithm.
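The risk-factor screening step described above can be illustrated with a minimal sketch. It assumes SciPy is available and uses synthetic data rather than the study dataset; the helper names `chi_square_p` and `t_test_p` and the simulated effect sizes are hypothetical, and the abstract does not state which t-test variant the authors used (the sketch uses the equal-variance default).

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the study data (NOT the real 5110 records).
rng = np.random.default_rng(0)
n = 1000
stroke = rng.integers(0, 2, size=n)                      # binary outcome
age = rng.normal(45, 15, size=n) + 15 * stroke           # continuous feature
hypertension = (rng.random(n) < 0.1 + 0.3 * stroke).astype(int)  # categorical feature


def chi_square_p(feature, target):
    """p-value of a chi-square test of independence for a categorical feature."""
    levels_f = np.unique(feature)
    levels_t = np.unique(target)
    # Contingency table: rows are feature levels, columns are outcome levels.
    table = np.array(
        [[np.sum((feature == f) & (target == t)) for t in levels_t] for f in levels_f]
    )
    return stats.chi2_contingency(table)[1]


def t_test_p(feature, target):
    """p-value of an independent-samples t-test for a continuous feature."""
    result = stats.ttest_ind(feature[target == 1], feature[target == 0])
    return result.pvalue


alpha = 0.05  # significance level quoted in the abstract
p_values = {
    "age": t_test_p(age, stroke),
    "hypertension": chi_square_p(hypertension, stroke),
}
risk_factors = [name for name, p in p_values.items() if p < alpha]
print(risk_factors)
```

Categorical attributes such as work_type or smoking_status would go through `chi_square_p` in the same way, and continuous ones such as bmi through `t_test_p`.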
Based on the findings of this study, we also developed a web-based application that doctors can use as an additional tool when diagnosing stroke risk.
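One plausible reading of the PCA-FA idea is to concatenate principal-component scores with factor scores and pass the combined representation to a random forest. The sketch below follows that reading with scikit-learn on synthetic data. The abstract does not specify how the two projections are integrated, so the concatenation, the component counts, and the hyperparameters are all assumptions made for illustration, not the authors' method.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, FactorAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data standing in for the stroke records.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed "PCA-FA" projection: PCA scores and FA scores placed side by side.
pca_fa = FeatureUnion(
    [
        ("pca", PCA(n_components=3)),
        ("fa", FactorAnalysis(n_components=3, random_state=0)),
    ]
)

model = Pipeline(
    [
        ("scale", StandardScaler()),  # put features on a common scale first
        ("project", pca_fa),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ]
)
model.fit(X_train, y_train)

# Evaluate with the two metrics reported in the abstract.
accuracy = accuracy_score(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

# The projected representation has 3 PCA columns plus 3 FA columns.
projected = model[:-1].transform(X_test)
print(projected.shape, round(accuracy, 3), round(auc, 3))
```

Standardizing before projection addresses the differing measurement scales mentioned in the abstract, since both PCA and FA are sensitive to feature variance.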
Pages: 15