Comparative analysis of feature selection techniques for COVID-19 dataset

Cited by: 2
Authors
Mohtasham, Farideh [1 ]
Pourhoseingholi, MohamadAmin [2 ]
Nazari, Seyed Saeed Hashemi [3 ]
Kavousi, Kaveh [4 ]
Zali, Mohammad Reza [1 ]
Affiliations
[1] Shahid Beheshti Univ Med Sci, Res Inst Gastroenterol & Liver Dis, Gastroenterol & Liver Dis Res Ctr, Tehran, Iran
[2] Univ Nottingham, Natl Inst Hlth & Care Res NIHR Nottingham Biomed R, Hearing Sci Mental Hlth & Clin Neurosci, Sch Med, Nottingham, England
[3] Shahid Beheshti Univ Med Sci SBMU, Dept Epidemiol, Sch Publ Hlth & Safety, Tehran, Iran
[4] Univ Tehran, Inst Biochem & Biophys IBB, Dept Bioinformat, Lab Complex Biol Syst & Bioinformat CBB, Tehran, Iran
Source
SCIENTIFIC REPORTS | 2024 / Volume 14 / Issue 01
Keywords
MODELS;
DOI
10.1038/s41598-024-69209-6
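Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code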
07 ; 0710 ; 09 ;
Abstract
In the context of early disease detection, machine learning (ML) has emerged as a vital tool. Feature selection (FS) algorithms play a crucial role in ensuring the accuracy of predictive models by identifying the most influential variables. This study, focusing on a retrospective cohort of 4778 COVID-19 patients from Iran, explores the performance of various FS methods, including filter, embedded, and hybrid approaches, in predicting mortality outcomes. The researchers leveraged 115 routine clinical, laboratory, and demographic features and employed 13 ML models to assess the effectiveness of these FS methods based on classification accuracy, predictive accuracy, and statistical tests. The results indicate that a Hybrid Boruta-VI model combined with the Random Forest algorithm demonstrated superior performance, achieving an accuracy of 0.89, an F1 score of 0.76, and an AUC value of 0.95 on test data. Key variables identified as important predictors of adverse outcomes include age, oxygen saturation levels, albumin levels, neutrophil counts, platelet levels, and markers of kidney function. These findings highlight the potential of advanced FS techniques and ML models in enhancing early disease detection and informing clinical decision-making.
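The three metrics reported in the abstract (accuracy, F1 score, and AUC) can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not data from the study, and the helper names (`accuracy`, `f1_score`, `roc_auc`) are hypothetical; the study's own evaluation pipeline is not described in this record.

```python
# Toy illustration of accuracy, F1, and ROC AUC for a binary outcome.
# All values are invented for the example; 1 = death, 0 = survival.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half), which equals the area under the ROC curve."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]   # predicted risk
y_pred = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 cutoff

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"accuracy={acc:.2f}  F1={f1:.2f}  AUC={auc:.2f}")
```

Accuracy and F1 depend on the chosen cutoff, while AUC is computed from the ranking of the scores alone. On imbalanced outcomes such as mortality, this is one reason F1 can be noticeably lower than AUC or accuracy.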
Pages: 20
Related papers
50 in total
  • [1] A novel firefly algorithm approach for efficient feature selection with COVID-19 dataset
    Bacanin, Nebojsa
    Venkatachalam, K.
    Bezdan, Timea
    Zivkovic, Miodrag
    Abouhawwash, Mohamed
    MICROPROCESSORS AND MICROSYSTEMS, 2023, 98
  • [2] Feature Selection by Hybrid Binary Ant Lion Optimizer with COVID-19 dataset
    Strumberger, Ivana
    Rakic, Andjela
    Stanojlovic, Stefan
    Arandjelovic, Jelena
    Bezdan, Timea
    Zivkovic, Miodrag
    Bacanin, Nebojsa
    2021 29TH TELECOMMUNICATIONS FORUM (TELFOR), 2021,
  • [3] An Analysis of Feature Selection Techniques For COVID-19 Detection on Chest X-Ray Data
    Selleti, Andre L. Jeller
    Silla Jr, Carlos N.
    2021 IEEE 21ST INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (IEEE BIBE 2021), 2021,
  • [4] Analysis of Feature Selection Techniques for Network Traffic Dataset
    Singh, Raman
    Kumar, Harish
    Singla, R. K.
    2013 INTERNATIONAL CONFERENCE ON MACHINE INTELLIGENCE AND RESEARCH ADVANCEMENT (ICMIRA 2013), 2013, : 42 - 46
  • [5] Development of a classifier with analysis of feature selection methods for COVID-19 diagnosis
    Chauhan, Hetal
    Modi, Kirit
    Shrivastava, Saurabh
    WORLD JOURNAL OF ENGINEERING, 2022, 19 (01) : 49 - 57
  • [6] Development of a classifier with analysis of feature selection methods for COVID-19 diagnosis
    Chauhan, H.
    Modi, K.
    Shrivastava, S.
    WORLD JOURNAL OF ENGINEERING, 2025, 22 (01) : 232 - 232
  • [7] A Comparative Analysis of Feature Selection Algorithms on Classification of Gene Microarray Dataset
    Jeyachidra, J.
    Punithavalli, M.
    2013 INTERNATIONAL CONFERENCE ON INFORMATION COMMUNICATION AND EMBEDDED SYSTEMS (ICICES), 2013, : 1088 - 1093
  • [8] On the Analysis of a Real Dataset of COVID-19 Patients in Alava
    Badiola-Zabala, Goizalde
    Lopez-Guede, Jose Manuel
    Estevez, Julian
    Grana, Manuel
    HYBRID ARTIFICIAL INTELLIGENT SYSTEMS, HAIS 2022, 2022, 13469 : 48 - 59
  • [9] A COVID-19 Rumor Dataset
    Cheng, Mingxi
    Wang, Songli
    Yan, Xiaofeng
    Yang, Tianqi
    Wang, Wenshuo
    Huang, Zehao
    Xiao, Xiongye
    Nazarian, Shahin
    Bogdan, Paul
    FRONTIERS IN PSYCHOLOGY, 2021, 12
  • [10] Enhancing Feature Selection Optimization for COVID-19 Microarray Data
    Krishanthi, Gayani
    Jayetileke, Harshanie
    Wu, Jinran
    Liu, Chanjuan
    Wang, You-Gan
    COVID, 2023, 3 (09): : 1336 - 1355