Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

Authors
Seba, P. Antony [1]
Benifa, J. V. Bibal [2]
Affiliations
[1] Indian Inst Informat Technol Kottayam, Dept Comp Sci & Engn, Kottayam, Kerala, India
[2] Indian Inst Informat Technol Kottayam, Dept Comp Sci & Engn, Kottayam, Kerala, India
Keywords
data contemplation; DEA; feature selection; TOPSIS; classifier
DOI
10.4218/etrij.2022-0018
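Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809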
Abstract
This article performs a detailed scrutiny of a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Data instances that do not influence the target are removed using data envelopment analysis (DEA), which reduces the number of rows. Column reduction is achieved by ranking the attributes with five feature selection methods: extra-trees classifier (ETC), recursive feature elimination, chi-squared test, analysis of variance, and mutual information. These methods are in turn ranked via the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) with weight optimization, to identify the optimal features from the CKD dataset for model building and so facilitate better prediction when diagnosing the severity of the disease. An efficient hybrid ensemble classifier and novel similarity-based classifiers are built using the pruned dataset, and the results are compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The hybrid ensemble classifier yields a better prediction accuracy of 98.31% for the features selected by ETC, which TOPSIS ranks as the best method.
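The TOPSIS ranking step mentioned in the abstract can be illustrated with a minimal sketch. This is the textbook form of TOPSIS (vector normalization, Euclidean distances), not necessarily the authors' implementation: it uses fixed weights rather than the weight optimization the abstract describes, and the function name `topsis`, the two criteria, the labels `method_a` to `method_c`, and all numbers are hypothetical placeholders rather than results from the paper.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives (rows) over criteria (columns) with TOPSIS.

    matrix  : list of rows, one per alternative
    weights : one non-negative weight per criterion
    benefit : True where larger is better, False where smaller is better
    Returns one closeness coefficient per alternative (higher is better).
    """
    n_criteria = len(weights)
    total = sum(weights)
    weights = [w / total for w in weights]  # normalize weights to sum to 1

    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_criteria)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
         for row in matrix]

    # Ideal and anti-ideal points, taken criterion by criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    anti = [min(c) if b else max(c) for c, b in zip(cols, benefit)]

    scores = []
    for row in v:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom else 0.0)
    return scores


# Hypothetical toy input: [accuracy (benefit), number of features (cost)].
toy = {
    "method_a": [0.95, 10],
    "method_b": [0.90, 8],
    "method_c": [0.80, 20],
}
scores = topsis(list(toy.values()), weights=[0.5, 0.5], benefit=[True, False])
ranking = sorted(zip(toy, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy input, `method_c` is worst on both criteria, so it coincides with the anti-ideal point and receives a closeness coefficient of zero. Changing the weights shifts the trade-off between the two criteria, which is the quantity a weight-optimization step would tune.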
Pages: 448-461
Page count: 14
Related papers
50 in total
  • [42] An Evaluation of Feature Selection and Reduction Algorithms for Network IDS Data
    Bjerkestrand, Therese
    Tsaptsinos, Dimitris
    Pfluegel, Eckhard
    2015 International Conference on Cyber Situational Awareness, Data Analytics and Assessment (CyberSA), 2015
  • [44] Unsupervised spectral feature selection algorithms for high dimensional data
    Wang, Mingzhao
    Han, Henry
    Huang, Zhao
    Xie, Juanying
    Frontiers of Computer Science, 2023, 17 (05) : 31 - 44
  • [45] Stable feature selection and classification algorithms for multiclass microarray data
    Student, Sebastian
    Fujarewicz, Krzysztof
    BIOLOGY DIRECT, 2012, 7
  • [46] Two Stages Feature Selection Based on Filter Ranking Methods and SVMRFE on Medical Applications
    Djellali, Hayet
    Zine, Nacira Ghoualmi
    Azizi, Nabiha
    MODELLING AND IMPLEMENTATION OF COMPLEX SYSTEMS, MISC 2016, 2016, : 281 - 293
  • [47] Improving the ranking quality of medical image retrieval using a genetic feature selection method
    da Silva, Sergio Francisco
    Ribeiro, Marcela Xavier
    Batista Neto, Joao do E. S.
    Traina, Caetano, Jr.
    Traina, Agma J. M.
    DECISION SUPPORT SYSTEMS, 2011, 51 (04) : 810 - 820
  • [48] A feature selection method with feature ranking using genetic programming
    Liu, Guopeng
    Ma, Jianbin
    Hu, Tongle
    Gao, Xiaoying
    CONNECTION SCIENCE, 2022, 34 (01) : 1146 - 1168
  • [49] Feature subset selection and feature ranking for multivariate time series
    Yoon, H
    Yang, KY
    Shahabi, C
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (09) : 1186 - 1198
  • [50] An Adaptive Multiple Feature Subset Method for Feature Ranking and Selection
    Chang, Fu
    Chen, Jen-Cheng
    INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI 2010), 2010, : 255 - 262