Gene selection from microarray gene expression data for classification of cancer subgroups employing PSO and adaptive K-nearest neighborhood technique

Cited by: 129
Authors
Kar, Subhajit [1]
Das Sharma, Kaushik [2]
Maitra, Madhubanti [3]
Affiliations
[1] Future Inst Engn & Management, Dept Elect Engn, Kolkata, India
[2] Univ Calcutta, Dept Appl Phys, Kolkata, India
[3] Jadavpur Univ, Dept Elect Engn, Kolkata, India
Keywords
Microarray data; SRBCT data; ALL_AML data; MLL data; Particle swarm optimization (PSO); Adaptive K-nearest neighborhood (KNN); Support vector machine (SVM); PARTICLE SWARM OPTIMIZATION; TYPE-2; FUZZY-LOGIC; NEURAL-NETWORKS; IDENTIFICATION; ALGORITHM; VALIDATION; DESIGN; SYSTEM;
DOI
10.1016/j.eswa.2014.08.014
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Microarray gene expression data now play an essential role in cancer classification. However, because only a small number of effective samples is available compared with the large number of genes in microarray data, many computational methods have failed to identify a small subset of important genes. It is therefore a challenging task to identify a small number of disease-specific, significant genes for the precise diagnosis of cancer subclasses. In this paper, a gene selection technique based on particle swarm optimization (PSO) together with adaptive K-nearest neighborhood (KNN) is proposed to distinguish a small subset of useful genes that is sufficient for the desired classification purpose. A proper value of K helps to form the appropriate number of neighborhoods to be explored and hence to classify the dataset accurately. Thus, a heuristic for efficiently selecting the optimal value of K, guided by the classification accuracy, is also proposed. The proposed technique for finding the minimum possible meaningful set of genes is applied to three benchmark microarray datasets, namely the small round blue cell tumor (SRBCT) data, the acute lymphoblastic leukemia (ALL) and acute myeloid leukemia (AML) data, and the mixed-lineage leukemia (MLL) data. The results demonstrate the usefulness of the proposed method in terms of classification accuracy on blind test samples, number of informative genes, and computing time. Further, the usefulness and universal characteristics of the identified genes are reconfirmed by using different classifiers, such as the support vector machine (SVM). (C) 2014 Elsevier Ltd. All rights reserved.
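The abstract does not give the update equations or the exact K heuristic, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a standard binary PSO (sigmoid transfer function), a plain majority-vote KNN, and a simple search over a few candidate K values scored by validation accuracy. All function names, parameter values, the subset-size penalty, and the synthetic data are hypothetical.

```python
import numpy as np

def knn_accuracy(X_train, y_train, X_test, y_test, k):
    """Accuracy of a majority-vote KNN classifier with Euclidean distance."""
    diff = X_test[:, None, :] - X_train[None, :, :]
    dist = (diff ** 2).sum(axis=2)
    nearest = np.argsort(dist, axis=1)[:, :k]
    pred = np.array([np.bincount(row).argmax() for row in y_train[nearest]])
    return float((pred == y_test).mean())

def best_k(X_train, y_train, X_val, y_val, k_values=(1, 3, 5)):
    """Assumed heuristic: keep the candidate K with the best validation accuracy."""
    scored = [(knn_accuracy(X_train, y_train, X_val, y_val, k), k) for k in k_values]
    return max(scored, key=lambda s: (s[0], -s[1]))  # ties go to the smaller K

def pso_select_genes(X_train, y_train, X_val, y_val, n_particles=12, n_iter=20,
                     w=0.7, c1=1.5, c2=1.5, size_penalty=0.05, seed=0):
    """Binary PSO over gene masks; fitness = accuracy minus a small size penalty."""
    rng = np.random.default_rng(seed)
    n_genes = X_train.shape[1]
    pos = rng.random((n_particles, n_genes)) < 0.5   # each row is a gene mask
    vel = rng.normal(0.0, 1.0, (n_particles, n_genes))

    def fitness(mask):
        if not mask.any():
            return -np.inf                           # empty subsets are invalid
        acc, _ = best_k(X_train[:, mask], y_train, X_val[:, mask], y_val)
        return acc - size_penalty * mask.sum() / n_genes

    pbest = pos.copy()
    pbest_fit = np.array([fitness(m) for m in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        x = pos.astype(float)
        vel = (w * vel + c1 * r1 * (pbest.astype(float) - x)
               + c2 * r2 * (gbest.astype(float) - x))
        vel = np.clip(vel, -4.0, 4.0)
        # Sigmoid transfer: a gene is selected with probability sigmoid(velocity).
        pos = rng.random(pos.shape) < 1.0 / (1.0 + np.exp(-vel))
        fit = np.array([fitness(m) for m in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if pbest_fit.max() > gbest_fit:
            g = pbest_fit.argmax()
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]

    acc, k = best_k(X_train[:, gbest], y_train, X_val[:, gbest], y_val)
    return gbest, k, acc

# Synthetic stand-in for microarray data: 60 samples, 20 "genes", 3 classes,
# where only genes 0 and 1 carry class information.
rng = np.random.default_rng(1)
y = np.repeat(np.arange(3), 20)
X = rng.normal(size=(60, 20))
X[:, 0] += 3.0 * y
X[:, 1] -= 3.0 * y
idx = rng.permutation(60)
train, val = idx[:40], idx[40:]
mask, k, acc = pso_select_genes(X[train], y[train], X[val], y[val])
print("genes kept:", int(mask.sum()), "K:", k, "validation accuracy:", acc)
```

On real microarray data, the accuracy reported here would be optimistic because the same validation split guides both gene selection and the choice of K; a separate blind test set, as the abstract describes, would be needed for an honest estimate.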
Pages: 612-627
Number of pages: 16
Related papers
50 in total
  • [21] Feature selection methods on gene expression microarray data for cancer classification: A systematic review
    Alhenawi, Esra'a
    Al-Sayyed, Rizik
    Hudaib, Amjad
    Mirjalili, Seyedali
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 140
  • [22] Gene selection from microarray data for cancer classification - a machine learning approach
    Wang, Y
    Tetko, IV
    Hall, MA
    Frank, E
    Facius, A
    Mayer, KFX
    Mewes, HW
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2005, 29 (01) : 37 - 46
  • [23] PCA and DWT Based Gene Selection Technique for Classification of Microarray Data
    Nirmalakumari, K.
    Rajaguru, Harikumar
    Rajkumar, P.
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMMUNICATION AND ELECTRONICS SYSTEMS (ICCES 2018), 2018, : 850 - 854
  • [24] Spatial clustering based gene selection for gene expression analysis in microarray data classification
    Dhas, P. Edwin
    Lalitha, S.
    Govindaraj, Annalakshmi
    Jyoshna, B.
    AUTOMATIKA, 2024, 65 (01) : 152 - 158
  • [25] A hybrid heuristic dimensionality reduction technique for microarray gene expression data classification: a blending of GA, PSO and ACO
    Uma, S. M.
    Kirubakaran, E.
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2016, 8 (02) : 160 - 179
  • [26] Gene selection for tumor classification using microarray gene expression data
    Yendrapalli, K.
    Basnet, R.
    Mukkamala, S.
    Sung, A. H.
    WORLD CONGRESS ON ENGINEERING 2007, VOLS 1 AND 2, 2007, : 290 - +
  • [27] Gene selection and classification of human lymphoma from microarray data
    Kamruzzaman, J
    Lim, S
    Gondal, I
    Begg, R
    BIOLOGICAL AND MEDICAL DATA ANALYSIS, PROCEEDINGS, 2005, 3745 : 379 - +
  • [28] Gene selection and classification of human lymphoma from microarray data
    Kamruzzaman, J.
    Lim, S.
    Gondal, I.
    Begg, R.
    TENCON 2005 - 2005 IEEE REGION 10 CONFERENCE, VOLS 1-5, 2006, : 195 - +
  • [29] Hybrid Feature Selection Algorithm mRMR-ICA for Cancer Classification from Microarray Gene Expression Data
    Wang, Shuaiqun
    Kong, Wei
    Aorigele
    Deng, Jin
    Gao, Shangce
    Zeng, Weiming
    COMBINATORIAL CHEMISTRY & HIGH THROUGHPUT SCREENING, 2018, 21 (06) : 420 - 430
  • [30] Prediction of Child Tumours from Microarray Gene Expression Data Through Parallel Gene Selection and Classification on Spark
    Lokeswari, Y. V.
    Jacob, Shomona Gracia
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, CIDM 2016, 2017, 556 : 651 - 661