Missing data and the accuracy of Bayesian phylogenetics

Cited by: 156
Authors
Wiens, John J. [1]
Moen, Daniel S. [1]
Affiliation
[1] SUNY Stony Brook, Dept Ecol & Evolut, Stony Brook, NY 11794 USA
Keywords
accuracy; Bayesian analysis; missing data; phylogenetic analysis
DOI
10.3724/SP.J.1002.2008.08040
Chinese Library Classification number
Q94 [Botany]
Discipline classification code
071001
Abstract
The effect of missing data on phylogenetic methods is a potentially important issue in our attempts to reconstruct the Tree of Life. If missing data are truly problematic, then it may be unwise to include in an analysis species that lack data for some characters (incomplete taxa) or characters that lack data for some species. Given the difficulty of obtaining data from all characters for all taxa (e.g., fossils), missing data might seriously impede efforts to reconstruct a comprehensive phylogeny that includes all species. Fortunately, recent simulations and empirical analyses suggest that missing data cells are not themselves problematic, and that incomplete taxa can be accurately placed as long as the overall number of characters in the analysis is large. However, these studies have so far been conducted only on parsimony, likelihood, and neighbor-joining methods. Although Bayesian phylogenetic methods have become widely used in recent years, the effects of missing data on Bayesian analysis have not been adequately studied. Here, we conduct simulations to test whether Bayesian analyses can accurately place incomplete taxa despite extensive missing data. In agreement with previous studies of other methods, we find that Bayesian analyses can accurately reconstruct the position of highly incomplete taxa (i.e., 95% missing data), as long as the overall number of characters in the analysis is large. These results suggest that highly incomplete taxa can be safely included in many Bayesian phylogenetic analyses.
Pages: 307-314
Number of pages: 8