Dirichlet Mixtures, the Dirichlet Process, and the Structure of Protein Space

Cited by: 12
Authors
Nguyen, Viet-An [1, 2]
Boyd-Graber, Jordan [2, 3]
Altschul, Stephen F. [4]
Affiliations
[1] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[2] Univ Maryland, UMIACS, College Pk, MD 20742 USA
[3] Univ Maryland, iSch, College Pk, MD 20742 USA
[4] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
alignment; computational molecular biology; dynamic programming; multiple alignment; sequence analysis;
DOI
10.1089/cmb.2012.0244
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The Dirichlet process is used to model probability distributions that are mixtures of an unknown number of components. Amino acid frequencies at homologous positions within related proteins have been fruitfully modeled by Dirichlet mixtures, and we use the Dirichlet process to derive such mixtures with an unbounded number of components. This application of the method requires several technical innovations to sample an unbounded number of Dirichlet-mixture components. The resulting Dirichlet mixtures model multiple-alignment data substantially better than do previously derived ones. They consist of over 500 components, in contrast to fewer than 40 previously, and provide a novel perspective on the structure of proteins. Individual protein positions should be seen not as falling into one of several categories, but rather as arrayed near probability ridges winding through amino acid multinomial space.
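The phrase "an unbounded number of components" can be made concrete with the Chinese restaurant process, a standard way to represent the partitions induced by a Dirichlet process. The sketch below is illustrative only and is not the sampler described in the paper: it draws cluster assignments from the prior alone, with no amino acid data and no Dirichlet-mixture parameters. The function name, the concentration parameter `alpha`, and the `seed` argument are assumptions introduced for this example.

```python
import random


def chinese_restaurant_process(n_items, alpha, seed=0):
    """Draw cluster assignments for n_items from a CRP prior (illustrative sketch).

    alpha (> 0) is the concentration parameter: larger values make it more
    likely that an item opens a new component.
    """
    rng = random.Random(seed)
    counts = []       # counts[k] = number of items currently in component k
    assignments = []  # assignments[i] = component index chosen for item i
    for _ in range(n_items):
        # An existing component k is chosen with weight counts[k];
        # a brand-new component is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new component
        else:
            counts[k] += 1     # join an existing component
        assignments.append(k)
    return assignments, counts


if __name__ == "__main__":
    assignments, counts = chinese_restaurant_process(1000, alpha=5.0)
    print("components used:", len(counts))
    print("items assigned:", sum(counts))
```

The number of components is not fixed in advance; it grows as more items arrive, which is the property that lets a Dirichlet process yield mixtures with far more components than a model with a preset component count.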
Pages: 1-18
Page count: 18