Estimating the number of clusters in DNA microarray data

Cited by: 0
Authors
Bolshakova, N [1]
Azuaje, F
Affiliations
[1] Trinity Coll Dublin, Dept Comp Sci, Dublin 2, Ireland
[2] Univ Ulster, Sch Comp & Math, Jordanstown BT52 1SA, Northern Ireland
Keywords
gene expression; data mining; clustering; cluster evaluation; validity indices
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objectives: The main objective of this research is to apply clustering and cluster validity methods to estimate the number of clusters in cancer tumour datasets. A weighted voting technique is used to improve the prediction of the number of clusters based on different data mining techniques. These tools may be used to identify new tumour classes from DNA microarray datasets, and the estimation approach may serve as a useful tool to support biological and biomedical knowledge discovery.
Methods: Three clustering algorithms and two validation algorithms were applied to two cancer tumour datasets. Recent studies confirm that there is no universal pattern recognition and clustering model to predict molecular profiles across different datasets. It is therefore useful not to rely on a single clustering or validation method, but to apply a variety of approaches; a combination of these methods may be successfully used to estimate the number of clusters.
Results: The methods implemented in this research may contribute to the validation of clustering results and the estimation of the number of clusters. The results show that this estimation approach may represent an effective tool to support biomedical knowledge discovery and healthcare applications.
Conclusion: The methods implemented in this research may be successfully used to estimate the number of clusters and may contribute to the validation of clustering results. These tools may be used to identify new tumour classes using gene expression profiles.
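The abstract does not name the clustering algorithms, the validity indices, or the weighting scheme, so the following is only a minimal sketch of how a weighted vote over several estimates of the number of clusters could work. The method labels (`clustering_A`, `index_1`, and so on), the estimates, and the weights are hypothetical placeholders, not values from the paper; the tie-breaking rule (prefer the smaller k) is likewise an assumption.

```python
from collections import defaultdict


def weighted_vote(votes, weights=None):
    """Combine several estimates of the number of clusters by weighted voting.

    votes   -- dict mapping a method label to its estimated number of clusters k
    weights -- optional dict mapping a method label to a non-negative weight;
               methods without an entry get weight 1.0 (assumption)
    Returns (best_k, tally), where tally maps each k to its total weight.
    """
    tally = defaultdict(float)
    for method, k in votes.items():
        w = 1.0 if weights is None else weights.get(method, 1.0)
        tally[k] += w
    # Highest total weight wins; ties go to the smaller k (assumption).
    best_k = min(tally, key=lambda k: (-tally[k], k))
    return best_k, dict(tally)


# Hypothetical estimates: three clustering methods x two validity indices.
example_votes = {
    ("clustering_A", "index_1"): 2,
    ("clustering_A", "index_2"): 3,
    ("clustering_B", "index_1"): 3,
    ("clustering_B", "index_2"): 3,
    ("clustering_C", "index_1"): 2,
    ("clustering_C", "index_2"): 4,
}
# Hypothetical weights: down-weight one combination judged less reliable.
example_weights = {("clustering_C", "index_2"): 0.5}

best_k, tally = weighted_vote(example_votes, example_weights)
print(best_k, tally)
```

In practice each entry in `example_votes` would come from running one clustering algorithm over a range of k and keeping the k that optimises one validity index; the weights could reflect how much each combination is trusted on the data at hand.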
Pages: 153-157
Page count: 5