How Well Does Your Phylogenetic Model Fit Your Data?

Cited by: 14
Authors
Shepherd, Daisy A. [1]
Klaere, Steffen [1,2]
Affiliations
[1] Univ Auckland, Dept Stat, Private Bag 92019, Auckland 1142, New Zealand
[2] Univ Auckland, Sch Biol Sci, Auckland, New Zealand
Keywords
GOODNESS-OF-FIT; MAXIMUM-LIKELIHOOD; STATISTICAL TESTS; SEQUENCE DATA; TREE; SITES; EVOLUTION; SELECTION; RECONSTRUCTION; CHARACTERS;
DOI
10.1093/sysbio/syy066
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
The test for model-to-data fitness is a fundamental principle within the statistical sciences. The purpose of such a test is to assess whether the selected best-fitting model adequately describes the behavior in the data. Despite their broad application across many areas of statistics, goodness of fit tests for phylogenetic models have received much less attention than model selection methods in the last decade. At present a number of approaches have been suggested. However, these are often flawed, with problems ranging from the presence of systematic error in the models themselves to the difficulties presented by the nature of phylogenetic data. Ultimately these problems lead to an inadequate choice of statistic. This is one of the main reasons why goodness of fit assessment is often a neglected step within phylogenetic analysis. We argue not only for the necessity of these goodness of fit measures to test how well the model reflects the data, but additionally for the need for useful tests that explain why the model-to-data fit may be inadequate. Such tests are a critical part of the model building process, allowing the model to be adapted to provide a better model-to-data fit or to reject a model class outright due to such an inadequate fit that the intended use of the class may be compromised. Proposed and existing methods in both the maximum likelihood and Bayesian framework will be discussed here, whilst highlighting their strengths and limitations for assessing goodness of fit. The final section discusses some critical open statistical problems in goodness of fit assessment for this field, with the hope of encouraging more research into such a fundamental yet underdeveloped area of phylogenetic inference. [Bayesian phylogenetics; Goodness of fit; maximum likelihood; molecular phylogenetics; outlier detection; residual diagnostics.].
Pages: 157-167
Number of pages: 11