SGAClust: Semi-supervised Graph Attraction Clustering of gene expression data

Authors
Mandal, Koyel [1]
Sarmah, Rosy [1]
Affiliations
[1] Tezpur Univ, Dept Comp Sci & Engn, Tezpur, Assam, India
Keywords
Semi-supervised clustering; Gene expression data; Biomarkers; Cancer disease; SEMANTIC SIMILARITY MEASURES; CANCER CELL-PROLIFERATION; BIOLOGICAL KNOWLEDGE; BIOMARKER DISCOVERY; FEATURE-SELECTION; MICROARRAY DATA; ONTOLOGY; CLASSIFICATION; LEUKEMIA; NETWORK
DOI
10.1007/s13721-022-00365-3
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Gene expression data clustering groups genes with similar expression patterns into the same cluster and places genes with dissimilar patterns into different clusters. Traditional partitional clustering of gene expression data divides the entire set of genes into a finite set of clusters, which might not reflect co-expression or coherent patterns across all genes belonging to a cluster. In this paper, we propose a graph-theoretic clustering algorithm called GAClust, which groups co-expressed genes into the same cluster while also detecting noise genes. Clustering of genes is based on the presumption that co-expressed genes are more likely to share common biological functions. However, it has been observed that the clusters produced by traditional methods often do not reflect true biological groups or functions. To address this issue, we propose a semi-supervised algorithm, SGAClust, to produce more biologically relevant clusters. We evaluate the proposed algorithms on both synthetic and cancer gene expression datasets. SGAClust is found to outperform the unsupervised algorithms. Additionally, we identify potential gene biomarkers that will further help in cancer management.
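The abstract does not describe the internals of GAClust or SGAClust, so the following is only a generic sketch of the broader idea it mentions: clustering genes on a co-expression graph while leaving some genes unassigned as noise. It is not the authors' algorithm. The function name `correlation_graph_clusters`, the Pearson-correlation edge rule, and the `threshold` and `min_size` parameters are all assumptions made for illustration, and the semi-supervised use of biological knowledge is not covered.

```python
import numpy as np

def correlation_graph_clusters(expr, threshold=0.9, min_size=2):
    """Illustrative co-expression graph clustering (NOT GAClust itself).

    expr: array of shape (n_genes, n_samples).
    Returns an integer label per gene; -1 marks a gene treated as noise.
    """
    n_genes = expr.shape[0]
    corr = np.corrcoef(expr)          # gene-by-gene Pearson correlation
    adjacent = corr >= threshold      # assumed edge rule: strong positive correlation
    np.fill_diagonal(adjacent, False) # no self-loops

    labels = np.full(n_genes, -1)
    visited = np.zeros(n_genes, dtype=bool)
    next_label = 0
    for start in range(n_genes):
        if visited[start]:
            continue
        # Depth-first search collects one connected component of the graph.
        stack = [start]
        visited[start] = True
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in np.flatnonzero(adjacent[node]):
                if not visited[neighbor]:
                    visited[neighbor] = True
                    stack.append(neighbor)
        # Components smaller than min_size stay labeled -1 (noise).
        if len(component) >= min_size:
            labels[component] = next_label
            next_label += 1
    return labels

# Toy data (made up): genes 0-1 rise together, genes 2-3 fall together,
# and gene 4 is uncorrelated with the rest.
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [10.0, 8.0, 6.0, 4.0, 2.0],
    [1.0, 3.0, 2.0, 3.0, 1.0],
])
labels = correlation_graph_clusters(expr)
```

In this toy run the two co-expressed pairs form separate clusters and the fifth gene is left as noise. A partitional method would instead force that gene into one of the clusters, which is the limitation the abstract points out.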
Pages: 20
Related papers
  • [1] SGAClust: Semi-supervised Graph Attraction Clustering of gene expression data
    Koyel Mandal
    Rosy Sarmah
    [J]. Network Modeling Analysis in Health Informatics and Bioinformatics, 2022, 11
  • [2] Semi-supervised consensus clustering for gene expression data analysis
    Wang, Yunli
    Pan, Youlian
    [J]. BIODATA MINING, 2014, 7
  • [3] Semi-supervised consensus clustering for gene expression data analysis
    Yunli Wang
    Youlian Pan
    [J]. BioData Mining, 7
  • [4] A semi-supervised fuzzy clustering algorithm applied to gene expression data
    Maraziotis, Ioannis A.
    [J]. PATTERN RECOGNITION, 2012, 45 (01) : 637 - 648
  • [5] A survey on semi-supervised graph clustering
    Daneshfar, Fatemeh
    Soleymanbaigi, Sayvan
    Yamini, Pedram
    Amini, Mohammad Sadra
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 133 (133)
  • [6] Clustering Analysis of Gene Expression Data based on Semi-supervised Visual Clustering Algorithm
    Fu-lai Chung
    Shitong Wang
    Zhaohong Deng
    Chen Shu
    D. Hu
    [J]. Soft Computing, 2006, 10 : 981 - 993
  • [7] Clustering analysis of gene expression data based on semi-supervised visual clustering algorithm
    Chung, Fu-lai
    Wang, Shitong
    Deng, Zhaohong
    Shu, Chen
    Hu, D.
    [J]. SOFT COMPUTING, 2006, 10 (11) : 981 - 993
  • [8] Semi-supervised clustering for gene-expression data in multiobjective optimization framework
    Alok, Abhay Kumar
    Saha, Sriparna
    Ekbal, Asif
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2017, 8 (02) : 421 - 439
  • [9] Semi-supervised clustering for gene-expression data in multiobjective optimization framework
    Abhay Kumar Alok
    Sriparna Saha
    Asif Ekbal
    [J]. International Journal of Machine Learning and Cybernetics, 2017, 8 : 421 - 439
  • [10] Simultaneous Feature Selection and Semi-supervised Clustering for Gene-Expression Data
    Alok, Abhay Kumar
    Saha, Sriparna
    Ekbal, Asif
    Kanekar, Neha
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING, INFORMATICS, COMMUNICATION AND ENERGY SYSTEMS (SPICES), 2015,