Investigating the performance of AIC in selecting phylogenetic models

Cited by: 8
Authors
Jhwueng, Dwueng-Chwuan [3]
Huzurbazar, Snehalata [4, 5, 6]
O'Meara, Brian C. [7]
Liu, Liang [1, 2]
Affiliations
[1] Univ Georgia, Dept Stat, Athens, GA 30606 USA
[2] Univ Georgia, Inst Bioinformat, Athens, GA 30606 USA
[3] Feng Chia Univ, Dept Stat, Taichung 40724, Taiwan
[4] Stat & Appl Math Sci Inst, Res Triangle Pk, NC 27709 USA
[5] Univ Wyoming, Dept Stat, Laramie, WY 82071 USA
[6] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[7] Univ Tennessee, Dept Ecol & Evolutionary Biol, Knoxville, TN 37996 USA
Funding
National Science Foundation (USA)
Keywords
AIC; Kullback-Leibler divergence; model selection; phylogenetics; AKAIKE INFORMATION CRITERION; LIKELIHOOD-RATIO TEST; SUBSTITUTION MODELS; DNA-SEQUENCES; EVOLUTION; JMODELTEST; ACCURATE; TESTS; RATES;
DOI
10.1515/sagmb-2013-0048
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The popular likelihood-based model selection criterion, Akaike's Information Criterion (AIC), is a breakthrough mathematical result derived from information theory. AIC is an approximation to the Kullback-Leibler (KL) divergence, with the derivation relying on the assumption that the likelihood function has finite second derivatives. However, for phylogenetic estimation, given that tree space is discrete with respect to tree topology, the assumption of a continuous likelihood function with finite second derivatives is violated. In this paper, we investigate the relationship between the expected log likelihood of a candidate model and the expected KL divergence in the context of phylogenetic tree estimation. We find that, given the tree topology, AIC is an unbiased estimator of the expected KL divergence. However, when the tree topology is unknown, AIC tends to underestimate the expected KL divergence for phylogenetic models. Simulation results suggest that the degree of underestimation varies across phylogenetic models, so that even for large sample sizes the bias of AIC can result in selecting a wrong model. As the choice of phylogenetic models is essential for statistical phylogenetic inference, it is important to improve the accuracy of model selection criteria in the context of phylogenetics.
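For readers unfamiliar with the criterion the abstract discusses, the standard definition is AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log likelihood; the model with the smallest AIC is selected. The sketch below is only an illustration of that definition, not the authors' simulation code. The log-likelihood values are hypothetical, and the parameter counts cover substitution-model parameters only, on the assumption of a fixed tree topology (where branch lengths would add the same constant to every model).

```python
def aic(log_likelihood: float, num_params: int) -> float:
    """Standard AIC: 2k - 2 ln L (smaller is better)."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical maximized log likelihoods for three substitution models
# fitted on one fixed topology; the numbers are made up for illustration.
# Each entry is (log likelihood, number of free substitution parameters).
candidates = {
    "JC69": (-2050.0, 0),
    "HKY85": (-2010.0, 4),
    "GTR": (-2008.5, 8),
}

# Score every candidate and pick the one with the smallest AIC.
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f}")
print("selected:", best)
```

In this made-up example the richer GTR model gains too little likelihood to offset its extra parameters. The abstract's point is that when the topology is also estimated, the penalty term 2k may be too small, and by an amount that differs between models, so a ranking like this one can be misleading.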
Pages: 459-475
Number of pages: 17