Stroke Treatment Prediction Using Features Selection Methods and Machine Learning Classifiers

Cited by: 4
Authors
Chourib, I. [1, 2]
Guillard, G. [3]
Farah, I. R. [1]
Solaiman, B. [2]
Affiliations
[1] Natl Sch Comp Sci, STICODE Dept, RIADI Lab, Manouba, Tunisia
[2] IMT Atlantique, MATHSTIC Dept, ITI Lab, Brest, France
[3] Intradys, Brest, France
Keywords
Stroke disease; Feature selection; Data mining; Decision tree classifier; Naive Bayes; K-nearest neighbor; Recursive feature elimination; Tree-based model; Chi-square; Classification
DOI
10.1016/j.irbm.2022.02.002
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Objectives: Feature selection is an important task that helps alleviate various machine learning and data mining issues. Its main objective is to build simpler and more understandable classifier models in order to improve data mining and processing performance. This study therefore performs a comparative evaluation of the Chi-square method, the recursive feature elimination (RFE) method, and a tree-based method (using Random Forest), applied to three common machine learning methods (K-Nearest Neighbor, naive Bayes classifier, and decision tree classifier), in order to select the most relevant features from a large set of attributes. It also determines the most suitable pair (i.e., feature selection method and machine learning method) that provides the best performance.

Materials and methods: An overview of the most common feature selection techniques is first provided: the Chi-square method, the recursive feature elimination (RFE) method, and the tree-based method (using Random Forest). The improvement brought by these feature selection methods to the three machine learning methods (K-Nearest Neighbor, naive Bayes classifier, and decision tree classifier) is then evaluated comparatively. For evaluation, micro-F1, accuracy, and root mean square error are measured on the stroke disease data set.

Results: The results show that the proposed approach (i.e., the tree-based method using Random Forest, TBM-RF, paired with the decision tree classifier, DTC) provides accuracy higher than 85% and an F1-score higher than 88%, outperforming KNN and NB combined with the Chi-square, RFE, and TBM-RF methods.

Conclusion: This study shows that the pair formed by the tree-based method using Random Forest (TBM-RF) and the decision tree classifier successfully and efficiently contributes to finding the most relevant features and to predicting and classifying patients suffering from stroke disease.

(c) 2022 AGBM. Published by Elsevier Masson SAS. All rights reserved.
Pages: 678-686
Page count: 9
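
The comparison described in the abstract (three feature selection methods crossed with three classifiers, scored by accuracy, micro-F1, and root mean square error) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data from `make_classification` stands in for the stroke data set, and the number of retained features `K`, the train/test split, the min-max scaling step, and all hyperparameters are assumptions chosen only to make the sketch runnable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, chi2
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

K = 5  # assumed number of features to keep (not taken from the paper)

# Synthetic binary data as a stand-in for the stroke data set (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Factories return a fresh, unfitted object for every pipeline.
selectors = {
    "Chi-square": lambda: SelectKBest(chi2, k=K),
    "RFE": lambda: RFE(
        DecisionTreeClassifier(random_state=0), n_features_to_select=K
    ),
    # threshold=-inf makes max_features the only selection criterion.
    "TBM-RF": lambda: SelectFromModel(
        RandomForestClassifier(n_estimators=100, random_state=0),
        max_features=K,
        threshold=-np.inf,
    ),
}
classifiers = {
    "KNN": lambda: KNeighborsClassifier(),
    "NB": lambda: GaussianNB(),
    "DTC": lambda: DecisionTreeClassifier(random_state=0),
}

results = {}
for sel_name, make_selector in selectors.items():
    for clf_name, make_classifier in classifiers.items():
        # chi2 requires non-negative inputs, hence the min-max scaling.
        model = make_pipeline(MinMaxScaler(), make_selector(), make_classifier())
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        results[(sel_name, clf_name)] = {
            "accuracy": accuracy_score(y_test, y_pred),
            "micro_f1": f1_score(y_test, y_pred, average="micro"),
            # RMSE computed directly on the 0/1 class labels.
            "rmse": float(np.sqrt(np.mean((y_test - y_pred) ** 2))),
        }

# Report the pairs from best to worst accuracy.
ranked = sorted(results.items(), key=lambda kv: kv[1]["accuracy"], reverse=True)
for (sel_name, clf_name), scores in ranked:
    print(
        f"{sel_name:10s} + {clf_name:3s}  "
        f"acc={scores['accuracy']:.3f}  "
        f"micro-F1={scores['micro_f1']:.3f}  "
        f"RMSE={scores['rmse']:.3f}"
    )
```

Placing the selector inside the pipeline means it is fitted on the training split only, so the held-out scores are not influenced by feature selection on test data. The ranking obtained on synthetic data says nothing about the stroke results reported in the abstract.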