INTEGRATIVE MODEL-BASED CLUSTERING OF MICROARRAY METHYLATION AND EXPRESSION DATA

Cited by: 19
Authors
Kormaksson, Matthias [1 ]
Booth, James G. [1 ]
Figueroa, Maria E. [2 ]
Melnick, Ari [2 ]
Affiliations
[1] Cornell Univ, Dept Stat Sci, Ithaca, NY 14853 USA
[2] Weill Cornell Med Coll, Dept Med, Hematol Oncol Div, New York, NY 10065 USA
Source
ANNALS OF APPLIED STATISTICS | 2012 / Vol. 6 / No. 3
Keywords
Integrative model-based clustering; microarray data; mixture models; EM algorithm; methylation; expression; AML; GENE-EXPRESSION; VARIABLE SELECTION; MIXTURE; LIKELIHOOD; ALGORITHM;
DOI
10.1214/11-AOAS533
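Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract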
In many fields, researchers are interested in large and complex biological processes. Two important examples are gene expression and DNA methylation in genetics. One key problem is to identify aberrant patterns of these processes and discover biologically distinct groups. In this article we develop a model-based method for clustering such data. The basis of our method involves the construction of a likelihood for any given partition of the subjects. We introduce cluster specific latent indicators that, along with some standard assumptions, impose a specific mixture distribution on each cluster. Estimation is carried out using the EM algorithm. The methods extend naturally to multiple data types of a similar nature, which leads to an integrated analysis over multiple data platforms, resulting in higher discriminating power.
Pages: 1327 - 1347
Number of pages: 21
Related papers
50 in total
  • [1] A mixture model-based approach to the clustering of microarray expression data
    McLachlan, GJ
    Bean, RW
    Peel, D
    [J]. BIOINFORMATICS, 2002, 18 (03) : 413 - 422
  • [2] Incorporating gene functions as priors in model-based clustering of microarray gene expression data
    Pan, W
    [J]. BIOINFORMATICS, 2006, 22 (07) : 795 - 801
  • [3] Model-based clustering of microarray expression data via latent Gaussian mixture models
    McNicholas, Paul D.
    Murphy, Thomas Brendan
    [J]. BIOINFORMATICS, 2010, 26 (21) : 2705 - 2712
  • [4] Semi-parametric model-based clustering for DNA microarray data
    Han, Bohyung
    Davis, Larry S.
    [J]. 18TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 3, PROCEEDINGS, 2006, : 324 - +
  • [5] Model-based clustering and data transformations for gene expression data
    Yeung, KY
    Fraley, C
    Murua, A
    Raftery, AE
    Ruzzo, WL
    [J]. BIOINFORMATICS, 2001, 17 (10) : 977 - 987
  • [6] CRCView: a web server for analyzing and visualizing microarray gene expression data using model-based clustering
    Xiang, Zuoshuang
    Qin, Zhaohui S.
    He, Yongqun
    [J]. BIOINFORMATICS, 2007, 23 (14) : 1843 - 1845
  • [7] Clustering microarray data using model-based double K-means
    Martella, Francesca
    Vichi, Maurizio
    [J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (09) : 1853 - 1869
  • [8] Model-based cluster analysis of microarray gene-expression data
    Wei Pan
    Jizhen Lin
    Chap T Le
    [J]. Genome Biology, 3 (2):
  • [9] Model-based cluster analysis of microarray gene-expression data
    Pan, Wei
    Lin, Jizhen
    Le, Chap T.
    [J]. GENOME BIOLOGY, 2002, 3 (02):
  • [10] MODEL-BASED CLUSTERING WITH DATA CORRECTION FOR REMOVING ARTIFACTS IN GENE EXPRESSION DATA
    Young, William Chad
    Raftery, Adrian E.
    Yeung, Ka Yee
    [J]. ANNALS OF APPLIED STATISTICS, 2017, 11 (04) : 1998 - 2026