Feature Selection and Clustering of Gene Expression Profiles Using Biological Knowledge

Cited by: 20
Authors
Mitra, Sushmita [1]
Ghosh, Sampreeti [1]
Affiliation
[1] Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India
Keywords
Attribute clustering; clustering large applications based on RANdomized search (CLARANS); feature selection; gene ontology (GO); medoid; CLASSIFICATION; ALGORITHMS; DATABASE; QUALITY; TOOL;
DOI
10.1109/TSMCC.2012.2209416
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, a novel feature selection algorithm, which is governed by biological knowledge, is developed. Gene expression data being high dimensional and redundant, dimensionality reduction is of prime concern. We employ the algorithm clustering large applications based on RANdomized search (CLARANS) for attribute clustering and dimensionality reduction based on gene ontology (GO) study. Feature selection with unsupervised learning is a difficult problem, with neither class labels present nor any guidance available to the search. Determination of the optimal number of clusters is another major issue, and has an impact on the resulting output. The use of GO analysis helps in the automated selection of biologically meaningful partitions. Tools such as Eisen plot and cluster profiles of these clusters help establish their coherence. Important representative features (or genes) are extracted from each correlated set of genes in such partitions. The algorithm is implemented on high-dimensional Yeast cell-cycle, Human Multiple Tissues, and Leukemia microarray data. In the second pass, clustering on the reduced gene space validates preservation of the inherent behavior of the original high-dimensional expression profiles. While the reduced gene set forms a biologically meaningful gene space, it simultaneously leads to a decrease in computational burden. External validation of the reduced subspace, using various well-known classifiers, establishes the effectiveness of the proposed methodology.
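As a rough illustration of the attribute-clustering idea in the abstract, the sketch below runs a generic CLARANS-style randomized medoid search over genes and returns the medoids as the representative (selected) genes. It is not the authors' implementation. The function names (`correlation_distance`, `total_cost`, `clarans_select`), the correlation-based distance, the parameters `num_local` and `max_neighbor`, and the toy data are all assumptions made for this example. The GO-based choice of the number of clusters described in the abstract is not modeled; `k` is simply given and must be smaller than the number of genes.

```python
import math
import random


def correlation_distance(x, y):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # constant profile: treat as uncorrelated (assumption)
    return 1.0 - cov / (sx * sy)


def total_cost(dist, medoids):
    """Sum, over all genes, of the distance to the nearest medoid."""
    return sum(min(dist[g][m] for m in medoids) for g in range(len(dist)))


def clarans_select(genes, k, num_local=4, max_neighbor=50, seed=0):
    """CLARANS-style search; returns (sorted medoid indices, cost).

    Assumes 1 <= k < len(genes). The medoids play the role of the
    representative genes kept after dimensionality reduction.
    """
    rng = random.Random(seed)
    n = len(genes)
    dist = [[correlation_distance(genes[i], genes[j]) for j in range(n)]
            for i in range(n)]
    best, best_cost = None, math.inf
    for _ in range(num_local):  # independent random restarts
        current = rng.sample(range(n), k)
        cost = total_cost(dist, current)
        tries = 0
        while tries < max_neighbor:
            # A neighbor differs from the current solution in one medoid.
            pos = rng.randrange(k)
            candidate = rng.choice([g for g in range(n) if g not in current])
            neighbor = current[:pos] + [candidate] + current[pos + 1:]
            neighbor_cost = total_cost(dist, neighbor)
            if neighbor_cost < cost:
                current, cost, tries = neighbor, neighbor_cost, 0
            else:
                tries += 1
        if cost < best_cost:
            best, best_cost = sorted(current), cost
    return best, best_cost


# Toy data (hypothetical): rows are genes, columns are samples.
genes = [
    [1.0, 2.0, 3.0, 4.0],   # rising profile
    [2.0, 4.1, 6.0, 8.2],   # rising, correlated with gene 0
    [4.0, 3.0, 2.0, 1.0],   # falling profile
    [8.0, 6.1, 4.0, 2.1],   # falling, correlated with gene 2
]
selected, cost = clarans_select(genes, k=2)
print(selected, cost)
```

On this toy input the search keeps one gene from the rising pair and one from the falling pair, which is the sense in which a medoid stands in for a correlated set of genes.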
Pages: 1590-1599
Page count: 10
Related papers
50 in total
  • [1] Gene Selection using Biological Knowledge and Fuzzy Clustering
    Ghosh, Sampreeti
    Mitra, Sushmita
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2012,
  • [2] Fuzzy clustering with biological knowledge for gene selection
    Ghosh, Sampreeti
    Mitra, Sushmita
    Dattagupta, Rana
    [J]. APPLIED SOFT COMPUTING, 2014, 16 : 102 - 111
  • [3] Unsupervised gene selection using biological knowledge : application in sample clustering
    Acharya, Sudipta
    Saha, Sriparna
    Nikhil, N.
    [J]. BMC BIOINFORMATICS, 2017, 18
  • [4] Unsupervised gene selection using biological knowledge : application in sample clustering
    Sudipta Acharya
    Sriparna Saha
    N. Nikhil
    [J]. BMC Bioinformatics, 18
  • [5] Application of Biological Domain Knowledge Based Feature Selection on Gene Expression Data
    Yousef, Malik
    Kumar, Abhishek
    Bakir-Gungor, Burcu
    [J]. ENTROPY, 2021, 23 (01) : 1 - 15
  • [6] Feature selection and gene clustering from gene expression data
    Mitra, P
    Majumder, DD
    [J]. PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, 2004, : 343 - 346
  • [7] Informative Feature Clustering and Selection for Gene Expression Data
    Yang, Yuqi
    Yin, Pengshuai
    Luo, Zhihang
    Gu, Wenwen
    Chen, Renjie
    Wu, Qingyao
    [J]. IEEE ACCESS, 2019, 7 : 169174 - 169184
  • [8] PSO Based Feature Selection for Clustering Gene Expression Data
    Deepthi, P. S.
    Thampi, Sabu M.
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS (SPICES), 2015,
  • [9] Boosted unsupervised feature selection for tumor gene expression profiles
    Shi, Yifan
    Yang, Kaixiang
    Wang, Mengzhi
    Yu, Zhiwen
    Zeng, Huanqiang
    Hu, Yang
    [J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2024,
  • [10] A Review of Feature Selection Techniques via Gene Expression Profiles
    Ahmad, Farzana Kabir
    Deris, Safaai
    Norwawi, Norita Md.
    Othman, Nor Hayati
    [J]. INTERNATIONAL SYMPOSIUM OF INFORMATION TECHNOLOGY 2008, VOLS 1-4, PROCEEDINGS: COGNITIVE INFORMATICS: BRIDGING NATURAL AND ARTIFICIAL KNOWLEDGE, 2008, : 976 - +