The multispecies coalescent process models the genealogical relationships of genes sampled from several species, enabling useful predictions about phenomena such as the discordance between a gene tree and the species phylogeny due to incomplete lineage sorting. Conversely, knowledge of large collections of gene trees can inform us about several aspects of the species phylogeny, such as its topology and ancestral population sizes. A fundamental open problem in this context is how to efficiently compute the probability of a gene tree topology, given the species phylogeny. Although a number of algorithms for this task have been proposed, they either produce approximate results, or, when they are exact, they do not scale to large data sets. In this paper, we present some progress towards exact and efficient computation of the probability of a gene tree topology. We provide a new algorithm that, given a species tree and the number of genes sampled for each species, calculates the probability that the gene tree topology will be concordant with the species tree. Moreover, we provide an algorithm that computes the probability of any specific gene tree topology concordant with the species tree. Both algorithms run in polynomial time and have been implemented in Python. Experiments show that they are able to analyze data sets where thousands of genes are sampled in a matter of minutes to hours. (c) 2020 Elsevier Inc. All rights reserved.
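As a small illustration of concordance probability under incomplete lineage sorting, the sketch below covers only the classical three-species case with one gene sampled per species. It is not the polynomial-time algorithm described in the abstract. The function names are hypothetical, and the internal branch length `t` is assumed to be measured in coalescent units. For the species tree ((A,B),C), the two lineages from A and B fail to coalesce in the internal branch with probability e^{-t}, after which each of the three topologies is equally likely.

```python
import math


def concordance_probability_three_taxa(t: float) -> float:
    """Probability that the gene tree topology matches the species tree
    ((A,B),C) when one gene is sampled per species.

    t is the internal branch length in coalescent units (an assumption
    of this sketch, not a parameter name taken from the paper).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # With probability 1 - exp(-t) the A and B lineages coalesce in the
    # internal branch (always concordant); otherwise all three lineages
    # enter the root population and the concordant topology has chance 1/3.
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def discordant_topology_probability(t: float) -> float:
    """Probability of one specific discordant topology, e.g. ((A,C),B)."""
    if t < 0:
        raise ValueError("branch length must be non-negative")
    return (1.0 / 3.0) * math.exp(-t)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0):
        p = concordance_probability_three_taxa(t)
        print(f"t = {t:.1f}  P(concordant) = {p:.4f}")
```

When `t` is zero the three topologies are equally likely, and the concordance probability approaches 1 as the internal branch grows, which is the behaviour that makes discordance informative about branch lengths and population sizes.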