Stroke Treatment Prediction Using Features Selection Methods and Machine Learning Classifiers

Cited by: 4
Authors
Chourib, I. [1 ,2 ]
Guillard, G. [3 ]
Farah, I. R. [1 ]
Solaiman, B. [2 ]
Affiliations
[1] Natl Sch Comp Sci, STICODE Dept, RIADI Lab, Manouba, Tunisia
[2] IMT Atlantique, MATHSTIC Dept, ITI Lab, Brest, France
[3] Intradys, Brest, France
Keywords
Stroke disease; Feature selection; Data mining; Decision tree classifier; Naive Bayes; K-nearest neighbor; Recursive feature elimination; Tree-based model; Chi-square; CLASSIFICATION;
DOI
10.1016/j.irbm.2022.02.002
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Objectives: Feature selection is an important task that alleviates various machine learning and data mining issues. Its main objective is to build simpler, more understandable classifier models in order to improve data mining and processing performance. To this end, a comparative evaluation of the Chi-square method, the recursive feature elimination method, and a tree-based method (using Random Forest), applied to three common machine learning methods (K-Nearest Neighbor, naive Bayes classifier, and decision tree classifier), is performed to select the most relevant features from a large set of attributes. The study also determines which pair (feature selection method, machine learning method) provides the best performance.
Materials and methods: An overview of the most common feature selection techniques is first provided: the Chi-square method, the Recursive Feature Elimination (RFE) method, and the tree-based method (using Random Forest). The improvement that these feature selection methods bring to the three machine learning methods (K-Nearest Neighbor, naive Bayes classifier, and decision tree classifier) is then evaluated comparatively. Micro-F1, accuracy, and root mean square error are used as evaluation measures on the stroke disease data set.
Results: The results show that the proposed approach, which pairs the tree-based method using Random Forest (TBM-RF) with the decision tree classifier (DTC), achieves an accuracy higher than 85% and an F1-score higher than 88%, outperforming KNN and NB combined with the Chi-square, RFE, and TBM-RF methods.
Conclusion: This study shows that the TBM-RF and decision tree classifier pair successfully and efficiently identifies the most relevant features and predicts and classifies patients suffering from stroke disease.
© 2022 AGBM. Published by Elsevier Masson SAS. All rights reserved.
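As an illustration of the ingredients named in the abstract, the sketch below scores two features with a textbook contingency-table chi-square statistic and computes accuracy, micro-F1, and root mean square error for one set of predictions. It is not the authors' implementation: the function names, the toy labels, and the two toy features are hypothetical, and the chi-square variant shown is only one common formulation.

```python
import math
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between a categorical feature and the labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            score += (observed[(f_value, l_value)] - expected) ** 2 / expected
    return score


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def micro_f1(y_true, y_pred):
    """F1 computed from true/false positives and false negatives pooled over classes."""
    tp = fp = fn = 0
    for c in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            tp += (t == c and p == c)
            fp += (t != c and p == c)
            fn += (t == c and p != c)
    return 2 * tp / (2 * tp + fp + fn)


def rmse(y_true, y_pred):
    """Root mean square error between numeric labels and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical toy data: binary labels and two binary features.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = [1, 1, 1, 0, 0, 0, 0, 1]  # agrees with the label in 6 of 8 rows
noise = [1, 0, 1, 0, 1, 0, 1, 0]        # independent of the label

scores = {
    "informative": chi_square_score(informative, labels),
    "noise": chi_square_score(noise, labels),
}
ranked = sorted(scores, key=scores.get, reverse=True)  # highest score first

# Hypothetical predictions from some classifier, reused for all three measures.
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(ranked, accuracy(labels, y_pred), micro_f1(labels, y_pred), rmse(labels, y_pred))
```

In single-label classification, micro-F1 equals accuracy, so the two measures agree on this toy example; a feature selection step would keep the top-ranked features before a classifier is trained.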
Pages: 678-686
Number of pages: 9
Related papers
50 in total
  • [41] Comparison of Feature Selection Methods and Machine Learning Classifiers with CT Radiomics-Based Features for Predicting Chronic Obstructive Pulmonary Disease
    Makimoto, K.
    Au, R. C.
    Moslemi, A.
    Hogg, J. C.
    Bourbeau, J.
    Tan, W. C.
    Kirby, M.
    AMERICAN JOURNAL OF RESPIRATORY AND CRITICAL CARE MEDICINE, 2022, 205
  • [42] Heart Diseases Prediction for Optimization based Feature Selection and Classification using Machine Learning Methods
    Rajinikanth, N.
    Pavithra, L.
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (02) : 636 - 643
  • [43] Prediction of blood supply in vestibular schwannomas using radiomics machine learning classifiers
    Dixiang Song
    Yixuan Zhai
    Xiaogang Tao
    Chao Zhao
    Minkai Wang
    Xinting Wei
    Scientific Reports, 11
  • [44] Selection and combination of machine learning classifiers for prediction of linear B-cell epitopes on proteins
    Söllner, J
    JOURNAL OF MOLECULAR RECOGNITION, 2006, 19 (03) : 209 - 214
  • [45] Stock market prediction using machine learning classifiers and social media, news
    Wasiat Khan
    Mustansar Ali Ghazanfar
    Muhammad Awais Azam
    Amin Karami
    Khaled H. Alyoubi
    Ahmed S. Alfakeeh
    Journal of Ambient Intelligence and Humanized Computing, 2022, 13 : 3433 - 3456
  • [46] Stock market prediction using machine learning classifiers and social media, news
    Khan, Wasiat
    Ghazanfar, Mustansar Ali
    Azam, Muhammad Awais
    Karami, Amin
    Alyoubi, Khaled H.
    Alfakeeh, Ahmed S.
    JOURNAL OF AMBIENT INTELLIGENCE AND HUMANIZED COMPUTING, 2020, 13 (7) : 3433 - 3456
  • [47] Prediction of survival and metastasis in breast cancer patients using machine learning classifiers
    Tapak, Leili
    Shirmohammadi-Khorram, Nasrin
    Amini, Payam
    Alafchi, Behnaz
    Hamidi, Omid
    Poorolajal, Jalal
    CLINICAL EPIDEMIOLOGY AND GLOBAL HEALTH, 2019, 7 (03) : 293 - 299
  • [48] Heart Disease Risk Prediction Using Machine Learning Classifiers with Attribute Evaluators
    Reddy, Karna Vishnu Vardhana
    Elamvazuthi, Irraivan
    Abd Aziz, Azrina
    Paramasivam, Sivajothi
    Chua, Hui Na
    Pranavanand, S.
    APPLIED SCIENCES-BASEL, 2021, 11 (18)
  • [49] Prediction of heart disease by classifying with feature selection and machine learning methods
    Gazeloglu, Cengiz
    PROGRESS IN NUTRITION, 2020, 22 (02) : 660 - 670
  • [50] Prediction of plant lncRNA by ensemble machine learning classifiers
    Caitlin M. A. Simopoulos
    Elizabeth A. Weretilnyk
    G. Brian Golding
    BMC Genomics, 19