Distributed feature selection (DFS) strategy for microarray gene expression data to improve the classification performance

Cited by: 24
Authors
Potharaju, Sai Prasad [1]
Sreedevi, M. [1]
Affiliations
[1] KL Univ, Dept CSE, Guntur, AP, India
Source
CLINICAL EPIDEMIOLOGY AND GLOBAL HEALTH | 2019 / Vol. 7 / Issue 02
Keywords
Microarray; Feature selection; Classification; High dimensionality;
DOI
10.1016/j.cegh.2018.04.001
Chinese Library Classification (CLC) number
R1 [Preventive Medicine, Hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Objective: This article presents a novel feature selection strategy for improving classification performance on high-dimensional datasets. The curse of dimensionality is the most serious drawback of microarray data, because such data contain a large number of genes (features), and this undermines computational stability. In microarray data analytics, identifying the most relevant features requires careful attention. Most researchers apply a two-stage strategy to gene expression data analysis. In the first stage, feature selection or feature extraction is employed as a preprocessing step to pinpoint the most prominent features. In the second stage, classification is applied using the selected subset of features. Method: This work follows the same two-stage strategy, but introduces a distributed feature selection (DFS) strategy that uses Symmetrical Uncertainty (SU) and a Multi-Layer Perceptron (MLP) and distributes the features across multiple clusters. Each cluster holds a finite number of features. An MLP is applied to each cluster, and the cluster with the highest accuracy and the lowest root mean square (RMS) error rate is nominated as the dominant cluster. Result: Classification accuracy with Ridor, Simple Cart (SC), KNN, and SVM is measured using the dominant cluster's features. The performance of this cluster is compared with traditional filter-based ranking techniques such as Information Gain (IG), Gain Ratio Attribute Evaluator (GRAE), and Chi-Squared Attribute Evaluator (Chi). Applied to seven high-dimensional datasets and one lower-dimensional dataset, the proposed method recorded approximately a 57% success rate and an 18% competitive rate against the traditional methods. Conclusion: The proposed methodology was applied to very high-dimensional microarray datasets. With this method, memory consumption will be reduced and classification performance can be improved.
Pages: 171-176
Page count: 6
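The abstract names Symmetrical Uncertainty (SU) as the ranking measure and says features are distributed across clusters, but gives no formulas or assignment rule. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: it uses the standard definition SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), assumes the expression values have already been discretized, and deals the ranked features round-robin into clusters. The round-robin rule, the function names, and the toy gene names `g1` to `g4` are all hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully dependent."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def distribute_features(scores, n_clusters):
    """Deal features, ranked by descending score, round-robin into clusters.

    Assumption for illustration only: the abstract does not say how
    features are assigned to clusters.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [ranked[i::n_clusters] for i in range(n_clusters)]


# Toy data: already-discretized expression levels for four hypothetical genes.
features = {
    "g1": [0, 0, 1, 1, 0, 1],  # identical to the class labels
    "g2": [0, 1, 0, 1, 0, 1],  # weakly related to the labels
    "g3": [1, 1, 0, 0, 1, 0],  # exact complement of the labels
    "g4": [0, 0, 0, 0, 1, 1],  # statistically independent of the labels
}
labels = [0, 0, 1, 1, 0, 1]

su_scores = {name: symmetrical_uncertainty(col, labels) for name, col in features.items()}
clusters = distribute_features(su_scores, n_clusters=2)
```

In the workflow the abstract describes, a classifier (an MLP there) would then be trained on each cluster's features, and the cluster with the highest accuracy and lowest RMS error would be kept as the dominant cluster. That evaluation step is not shown here.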