ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data

Cited by: 1
Authors
Osmala, Maria [1]
Eraslan, Gokcen [2]
Lahdesmaki, Harri [1]
Affiliations
[1] Aalto Univ, Dept Comp Sci, Espoo 02150, Finland
[2] Broad Inst Harvard & MIT, Klarman Cell Observ, Cambridge, MA 02142 USA
Funding
Academy of Finland
Keywords
ChIP-seq; chromatin; transcription
DOI
10.1093/bioinformatics/btac444
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements.

Results: We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites.
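
To make the phrase "product Dirichlet-multinomial mixture" concrete, the following is a minimal Python sketch of the generic building blocks: the standard Dirichlet-multinomial log-probability of a count vector, and the posterior cluster probabilities of one region when the per-feature likelihoods are multiplied together. It is not ChromDMM's implementation. The function names, the argument layout and the toy numbers are assumptions made for illustration, and the shifting, flipping and hyper-parameter optimization described in the abstract are left out.

```python
import math


def dm_log_pmf(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per bin.
    alpha: positive Dirichlet parameters, one per bin.
    """
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: log n! - sum_i log x_i!
    out = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizers after integrating out the bin probabilities.
    out += math.lgamma(a0) - math.lgamma(n + a0)
    out += sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return out


def responsibilities(region, weights, alphas):
    """Posterior cluster probabilities for one genomic region.

    region: list of count vectors, one per chromatin feature (hypothetical layout).
    weights: mixture weights, one per cluster.
    alphas: alphas[k][m] is the parameter vector of cluster k for feature m.
    """
    log_post = []
    for pi_k, alpha_k in zip(weights, alphas):
        # "Product" model: features are treated as independent given the cluster,
        # so their log-likelihoods add up.
        ll = sum(dm_log_pmf(x, a) for x, a in zip(region, alpha_k))
        log_post.append(math.log(pi_k) + ll)
    # Normalize in log space for numerical stability.
    top = max(log_post)
    unnorm = [math.exp(v - top) for v in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy example: two features with three bins each, and two clusters
# whose profiles are mirror images of each other.
toy_region = [[9, 1, 0], [0, 1, 9]]
toy_weights = [0.5, 0.5]
toy_alphas = [
    [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]],
    [[1.0, 1.0, 5.0], [5.0, 1.0, 1.0]],
]
toy_resp = responsibilities(toy_region, toy_weights, toy_alphas)
```

In an expectation-maximization fit, responsibilities of this kind would be computed for every region in the E-step. A model in the spirit of the abstract would additionally sum over candidate shift and flip states of each profile, which this sketch does not attempt.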
Pages: 3863-3870
Page count: 8