Dirichlet Mixtures, the Dirichlet Process, and the Structure of Protein Space

Cited by: 12
Authors
Nguyen, Viet-An [1,2]
Boyd-Graber, Jordan [2,3]
Altschul, Stephen F. [4 ]
Affiliations
[1] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[2] Univ Maryland, UMIACS, College Pk, MD 20742 USA
[3] Univ Maryland, iSch, College Pk, MD 20742 USA
[4] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
alignment; computational molecular biology; dynamic programming; multiple alignment; sequence analysis;
DOI
10.1089/cmb.2012.0244
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
The Dirichlet process is used to model probability distributions that are mixtures of an unknown number of components. Amino acid frequencies at homologous positions within related proteins have been fruitfully modeled by Dirichlet mixtures, and we use the Dirichlet process to derive such mixtures with an unbounded number of components. This application of the method requires several technical innovations to sample an unbounded number of Dirichlet-mixture components. The resulting Dirichlet mixtures model multiple-alignment data substantially better than do previously derived ones. They consist of over 500 components, in contrast to fewer than 40 previously, and provide a novel perspective on the structure of proteins. Individual protein positions should be seen not as falling into one of several categories, but rather as arrayed near probability ridges winding through amino acid multinomial space.
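As a rough illustration of how a Dirichlet process lets the number of mixture components grow with the data rather than being fixed in advance, the sketch below samples component assignments from a Chinese restaurant process, a standard constructive view of the Dirichlet process. It is not the sampler described in the paper; the function name `crp_assignments` and the `concentration` parameter are assumptions made for illustration only.

```python
import random


def crp_assignments(n_items, concentration, seed=0):
    """Sample component labels for n_items from a Chinese restaurant process.

    `concentration` (hypothetical name, must be > 0) controls how readily
    new components are opened.
    """
    rng = random.Random(seed)
    counts = []        # counts[k] = number of items assigned to component k
    assignments = []   # assignments[i] = component label of item i
    for _ in range(n_items):
        # An existing component k is chosen with weight counts[k];
        # a brand-new component is opened with weight `concentration`.
        weights = counts + [concentration]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new component
        else:
            counts[k] += 1     # join an existing component
        assignments.append(k)
    return assignments, counts
```

Calling `crp_assignments(1000, 2.0)` returns one label per item together with the component sizes; the number of components is not set beforehand and tends to increase slowly as more items are added.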
Pages: 1-18
Page count: 18