Applications of Feature Selection Techniques on Large Biomedical Datasets

Cited by: 0
Authors
Ewen, Nicolas [1]
Abdou, Tamer [1, 2]
Bener, Ayse [1]
Affiliations
[1] Ryerson Univ, Data Sci Lab, Toronto, ON M5B 2K3, Canada
[2] Arish Univ, Fac Sci, North Sinai 45516, Egypt
Source
Keywords
Feature selection; Bio-medical; Large dataset
DOI
10.1007/978-3-030-18305-9_57
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The main goal of this paper is to determine the best feature selection algorithm to use on large biomedical datasets. Feature selection has a potentially important role in analyzing such datasets. We applied four feature selection techniques to large biomedical datasets: Information Gain, Chi-Squared, Markov Blanket Discovery, and Recursive Feature Elimination. We measured the efficiency of the selection, the stability of the algorithms, and the quality of the features chosen. Of the four techniques, the Information Gain and Chi-Squared filters were the most efficient and stable. Both Markov Blanket Discovery and Recursive Feature Elimination took significantly longer to select features and were less stable. The features selected by Recursive Feature Elimination were of the highest quality, followed by those of Information Gain and Chi-Squared, with Markov Blanket Discovery last. For educational purposes (e.g. for those in the biomedical field teaching data techniques), we recommend the Information Gain or Chi-Squared filter. For research or analysis, we recommend one of the filters or Recursive Feature Elimination, depending on the situation. We do not recommend Markov Blanket Discovery for the situations used in this trial, keeping in mind that the experiments were not exhaustive.
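As a rough illustration of the two filter methods named in the abstract, the following sketch scores discrete features by Information Gain and by the Pearson chi-squared statistic, then keeps the top-ranked ones. It is not the authors' code. The toy data, the feature names (`marker_a` and so on), the helper names, and the use of Jaccard overlap as a stand-in for selection stability are all assumptions made for illustration; the abstract does not say how stability was measured.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """H(labels) - H(labels | feature) for one discrete feature column."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def chi_squared(feature, labels):
    """Pearson chi-squared statistic of the feature-by-label contingency table."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for value, f_total in feature_totals.items():
        for label, l_total in label_totals.items():
            expected = f_total * l_total / n
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


def top_k(columns, labels, score, k):
    """Names of the k highest-scoring features under the given filter score."""
    ranked = sorted(columns, key=lambda name: score(columns[name], labels), reverse=True)
    return ranked[:k]


def jaccard(a, b):
    """Overlap of two selected-feature sets; 1.0 means identical selections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Toy data (invented for illustration): three binary features, binary label.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "marker_a": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly predictive
    "marker_b": [1, 1, 1, 0, 0, 0, 0, 1],  # partly predictive
    "marker_c": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the label
}

ig_selected = top_k(columns, labels, information_gain, k=2)
chi_selected = top_k(columns, labels, chi_squared, k=2)
agreement = jaccard(ig_selected, chi_selected)
```

Both filters score each feature independently of the others, which is one plausible reason they would run faster than a wrapper such as Recursive Feature Elimination, which refits a model repeatedly.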
Pages: 543-548
Page count: 6