Evaluation Algorithms Based on Fuzzy C-means for the Data Clustering of Cancer Gene Expression

Cited by: 0
Authors
Al-Janabee, Omar [1]
Al-Sarray, Basad [1]
Affiliation
[1] University of Baghdad, College of Science, Computer Science Department, Iraq
Keywords
Brain - Cluster analysis - DNA sequences - Fuzzy systems - Genetic algorithms - K-means clustering - Particle swarm optimization (PSO) - Tumors
DOI
Not available
Chinese Library Classification (CLC) number
Discipline classification code
Abstract
Data in bioinformatics arrive primarily as DNA, RNA, and protein sequences, and their volume places a significant burden on both scientists and computing resources. Some genomics studies rely on clustering techniques to group similarly expressed genes. Clustering is a type of unsupervised learning that partitions unlabeled data into groups, dividing the input space into several homogeneous regions; it can be carried out with a variety of algorithms, including k-means and fuzzy c-means (FCM). This study used three models to cluster a brain tumor dataset. The first model uses FCM to cluster genes. FCM allows an object to belong to two or more clusters, with a membership grade between zero and one for each cluster, and the membership grades of each gene across all clusters sum to one. This paradigm is useful when dealing with microarray data. The total execution time of the first model is 22.2589 s. The second model combines FCM with particle swarm optimization (PSO) to obtain better results. This hybrid algorithm, FCM–PSO, uses the Davies–Bouldin (DB) index as its objective function. The experimental results show that the proposed hybrid FCM–PSO method is effective; its total execution time is 89.6087 s. The third model combines FCM with a genetic algorithm (GA), again using the DB index as the objective function. The experimental results show that the proposed hybrid FCM–GA method is effective; its total execution time is 50.8021 s. In addition, this study uses cluster validity indexes to determine the best partitioning of the underlying data. The internal validity indexes include the Jaccard, Davies–Bouldin, Dunn, Xie–Beni, and silhouette indexes. The external validity indexes include the Minkowski and adjusted Rand indexes and the percentage of correctly categorized pairings.
Experiments conducted on brain tumor gene expression data demonstrate that the techniques used in this study outperform traditional models in terms of stability and biological significance. © 2022 Autoctonía. Revista de Ciencias Sociales e Historia. All rights reserved.
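The membership rule described in the abstract (grades between zero and one that sum to one for each gene) and the Davies–Bouldin index used as the objective function can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `fuzzy_c_means` and `davies_bouldin`, the fuzzifier `m = 2`, the synthetic two-blob data, and the use of FCM centers in place of hard-cluster centroids when computing the index are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Minimal FCM sketch. Returns (centers, U), with U of shape (n_samples, c)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row is normalized to sum to one.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        W = U ** m
        # Each center is the membership-weighted mean of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def davies_bouldin(X, labels, centers):
    """Davies-Bouldin index of a hard partition; lower values are better.

    Assumption: the supplied centers (here the FCM centers) stand in for
    the centroids of the hard clusters.
    """
    ks = np.unique(labels)
    if len(ks) < 2:
        raise ValueError("at least two non-empty clusters are required")
    # Scatter: mean distance of each cluster's points to its center.
    s = np.array(
        [np.linalg.norm(X[labels == k] - centers[k], axis=1).mean() for k in ks]
    )
    total = 0.0
    for i, ki in enumerate(ks):
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (s[i] + s[j]) / np.linalg.norm(centers[ki] - centers[kj])
            for j, kj in enumerate(ks)
            if j != i
        )
    return total / len(ks)


# Synthetic stand-in for expression profiles: two well-separated blobs.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, (50, 4)), rng.normal(5.0, 0.5, (50, 4))])

centers, U = fuzzy_c_means(X, c=2)
labels = U.argmax(axis=1)  # harden the fuzzy partition for the index
db = davies_bouldin(X, labels, centers)
print("row sums equal one:", np.allclose(U.sum(axis=1), 1.0))
print("Davies-Bouldin index:", db)
```

In a hybrid scheme of the kind the abstract describes, a search method such as PSO or a GA would propose candidate cluster centers and score each candidate with an index like `davies_bouldin`; how the authors encode candidates and couple the search to FCM is not stated in this record.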
Pages: 27-41
Related papers
50 in total
  • [1] Effective fuzzy c-means clustering algorithms for data clustering problems
    Kannan, S. R.
    Ramathilagam, S.
    Chung, P. C.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2012, 39 (07) : 6292 - 6300
  • [2] Fuzzy c-means clustering based on weights and gene expression programming
    Jiang, Zhaohui
    Li, Tingting
    Min, Wenfang
    Qi, Zhao
    Rao, Yuan
    [J]. PATTERN RECOGNITION LETTERS, 2017, 90 : 1 - 7
  • [3] DATA CLUSTERING BASED ON FUZZY C-MEANS AND CHAOTIC WHALE OPTIMIZATION ALGORITHMS
    Arslan, Hatice
    Toz, Metin
    [J]. SIGMA JOURNAL OF ENGINEERING AND NATURAL SCIENCES-SIGMA MUHENDISLIK VE FEN BILIMLERI DERGISI, 2019, 37 (04) : 1103 - 1124
  • [4] Federated c-Means and Fuzzy c-Means Clustering Algorithms for Horizontally and Vertically Partitioned Data
    Bárcena, Jose Luis Corcuera
    Marcelloni, Francesco
    Renda, Alessandro
    Bechini, Alessio
    Ducange, Pietro
    [J]. IEEE Transactions on Artificial Intelligence, 2024, 5 (12) : 6426 - 6441
  • [5] Intuitionistic fuzzy C-means clustering algorithms
    Zeshui Xu
    [J]. Journal of Systems Engineering and Electronics, 2010, 21 (04) : 580 - 590
  • [6] Intuitionistic fuzzy C-means clustering algorithms
    Xu, Zeshui
    Wu, Junjie
    [J]. JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2010, 21 (04) : 580 - 590
  • [7] An improved ant-based algorithm based on heaps merging and fuzzy c-means for clustering cancer gene expression data
    Bulut, Hasan
    Onan, Aytug
    Korukoglu, Serdar
    [J]. SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2020, 45 (01)
  • [8] An improved ant-based algorithm based on heaps merging and fuzzy c-means for clustering cancer gene expression data
    Hasan Bulut
    Aytuğ Onan
    Serdar Korukoğlu
    [J]. Sādhanā, 2020, 45
  • [9] CLUSTERING MICROARRAY GENE EXPRESSION DATA USING FUZZY C-MEANS AND DTW DISTANCE
    Taghizad, H.
    Mehridehnavi, A.
    [J]. 2011 3RD INTERNATIONAL CONFERENCE ON COMPUTER TECHNOLOGY AND DEVELOPMENT (ICCTD 2011), VOL 1, 2012 : 395 - 399
  • [10] A fuzzy clustering model of data and fuzzy c-means
    Nascimento, S
    Mirkin, B
    Moura-Pires, F
    [J]. NINTH IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2000), VOLS 1 AND 2, 2000 : 302 - 307