Accuracy and power of Bayes prediction of amino acid sites under positive selection

Cited by: 303
Authors
Anisimova, M
Bielawski, JP
Yang, ZH
Affiliations
[1] UCL, Dept Biol, Galton Lab, London WC1E 6BT, England
[2] UCL, Ctr Math & Phys Life Sci & Expt Biol, London WC1E 6BT, England
Keywords
Bayes inference; likelihood; nonsynonymous-synonymous rate ratio; positive selection; posterior probability
DOI
10.1093/oxfordjournals.molbev.a004152
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Bayes prediction quantifies uncertainty by assigning posterior probabilities. It was used to identify amino acids in a protein under recurrent diversifying selection, indicated by a higher nonsynonymous (d_N) than synonymous (d_S) substitution rate, that is, by omega = d_N/d_S > 1. Parameters were estimated by maximum likelihood under a codon substitution model that assumed several classes of sites with different omega ratios. Bayes' theorem was used to calculate the posterior probability of each site falling into each of these site classes. Here, we evaluate the performance of Bayes prediction of amino acids under positive selection by computer simulation. We measured the accuracy by the proportion of predicted sites that were truly under selection and the power by the proportion of true positively selected sites that were predicted by the method. The accuracy was slightly better for longer sequences, whereas the power was largely unaffected by the increase in sequence length. Both accuracy and power were higher for medium or highly diverged sequences than for similar sequences. We found that accuracy and power were unacceptably low when data contained only a few highly similar sequences. However, sampling a large number of lineages improved the performance substantially. Even for very similar sequences, accuracy and power can be high if over 100 taxa are used in the analysis. We make the following recommendations: (1) prediction of positive selection sites is not feasible for a few closely related sequences; (2) using a large number of lineages is the best way to improve the accuracy and power of the prediction; and (3) multiple models of heterogeneous selective pressures among sites should be applied in real data analysis.
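The posterior calculation and the two performance measures defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 0.95 posterior cutoff, and all numeric inputs are hypothetical, and the per-class site likelihoods are assumed to have already been obtained from a fitted codon model.

```python
from typing import List, Sequence, Tuple


def site_class_posteriors(priors: Sequence[float],
                          likelihoods: Sequence[float]) -> List[float]:
    """Bayes' theorem for one site.

    priors[k]      : estimated proportion of sites in class k
    likelihoods[k] : probability of the site's data given class k's omega
    Returns P(class k | data at this site) for every class k.
    """
    joint = [p * f for p, f in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def predict_sites(posteriors_per_site: Sequence[Sequence[float]],
                  positive_class: int,
                  cutoff: float = 0.95) -> List[int]:
    """Indices of sites whose posterior for the omega > 1 class exceeds
    the cutoff (the 0.95 default is an assumption of this sketch)."""
    return [i for i, post in enumerate(posteriors_per_site)
            if post[positive_class] > cutoff]


def accuracy_and_power(predicted: Sequence[int],
                       truth: Sequence[int]) -> Tuple[float, float]:
    """Accuracy = fraction of predicted sites that are truly selected.
    Power    = fraction of truly selected sites that were predicted."""
    pred, true = set(predicted), set(truth)
    hits = len(pred & true)
    accuracy = hits / len(pred) if pred else float("nan")
    power = hits / len(true) if true else float("nan")
    return accuracy, power


# Hypothetical three-class model; the last class has omega > 1.
priors = [0.6, 0.3, 0.1]
site_likelihoods = [0.001, 0.002, 0.02]  # made-up values for one site
posterior = site_class_posteriors(priors, site_likelihoods)

# Hypothetical simulation outcome: sites 1-4 predicted, sites 2, 3, 5 true.
acc, pw = accuracy_and_power([1, 2, 3, 4], [2, 3, 5])
```

In a simulation study of the kind described, the true site classes are known, so `accuracy_and_power` can be evaluated directly on the output of `predict_sites` for each replicate data set.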
Pages: 950-958
Number of pages: 9