Relevancy contemplation in medical data analytics and ranking of feature selection algorithms

Authors
Seba, P. Antony [1]
Benifa, J. V. Bibal [2]
Affiliations
[1] Indian Inst Informat Technol Kottayam, Dept Comp Sci & Engn, Kottayam, Kerala, India
[2] Informat Technol Kottayam, Dept Comp Sci & Engn, Kottayam, Kerala, India
Keywords
data contemplation; DEA; feature selection; TOPSIS; classifier
DOI
10.4218/etrij.2022-0018
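Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract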
This article performs a detailed data scrutiny on a chronic kidney disease (CKD) dataset to select efficient instances and relevant features. Data relevancy is investigated using feature extraction, hybrid outlier detection, and handling of missing values. Data instances that do not influence the target are removed using data envelopment analysis to enable reduction of rows. Column reduction is achieved by ranking the attributes through feature selection methodologies, namely, extra-trees classifier, recursive feature elimination, chi-squared test, analysis of variance, and mutual information. These methodologies are ranked via Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) using weight optimization to identify the optimal features for model building from the CKD dataset to facilitate better prediction while diagnosing the severity of the disease. An efficient hybrid ensemble and novel similarity-based classifiers are built using the pruned dataset, and the results are thereafter compared with random forest, AdaBoost, naive Bayes, k-nearest neighbors, and support vector machines. The hybrid ensemble classifier yields a better prediction accuracy of 98.31% for the features selected by extra tree classifier (ETC), which is ranked as the best by TOPSIS.
Pages: 448-461
Number of pages: 14
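
The abstract describes ranking feature selection methods with TOPSIS. As a rough illustration of that ranking step only, the following minimal sketch implements the textbook TOPSIS procedure with NumPy. The function name `topsis`, the toy decision matrix, the two criteria, and the equal weights are all assumptions made for this example; they are not taken from the article, which additionally optimizes the weights.

```python
import numpy as np


def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores (higher is better) for each row.

    matrix  : alternatives x criteria scores
    weights : one weight per criterion (rescaled here to sum to 1)
    benefit : True where larger is better, False where smaller is better
    """
    X = np.asarray(matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    benefit = np.asarray(benefit, dtype=bool)

    # Vector-normalize each criterion column, then apply the weights.
    V = X / np.sqrt((X ** 2).sum(axis=0)) * w

    # Ideal and anti-ideal points, chosen per criterion direction.
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti = np.where(benefit, V.min(axis=0), V.max(axis=0))

    # Euclidean distance of every alternative to both reference points.
    d_plus = np.sqrt(((V - ideal) ** 2).sum(axis=1))
    d_minus = np.sqrt(((V - anti) ** 2).sum(axis=1))

    # Relative closeness; undefined if all alternatives are identical.
    return d_minus / (d_plus + d_minus)


# Toy data (hypothetical, NOT results from the article):
# rows are three placeholder methods A, B, C;
# columns are accuracy (benefit) and number of selected features (cost).
toy_matrix = [[0.9, 5], [0.8, 10], [0.7, 15]]
scores = topsis(toy_matrix, weights=[0.5, 0.5], benefit=[True, False])
ranking = np.argsort(-scores)  # indices of the methods, best first
print(scores, ranking)
```

In this toy case the first row is best on both criteria, so it coincides with the ideal point and ranks first. With real evaluation scores, the ranking depends on the chosen criteria and weights.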