Fidelity of hyperbolic space for Bayesian phylogenetic inference

Cited by: 4
Authors
Macaulay, Matthew O. [1]
Darling, Aaron [2]
Fourment, Mathieu O. [1]
Affiliations
[1] Univ Technol Sydney, Australian Inst Microbiol & Infect, Sydney, Australia
[2] Illumina Australia Pty Ltd, Sydney, Australia
Funding
Australian Research Council
Keywords
MAXIMUM-LIKELIHOOD; TREE; PERFORMANCE; PROPOSALS;
DOI
10.1371/journal.pcbi.1011084
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Bayesian inference for phylogenetics is a gold standard for computing distributions of phylogenies. However, Bayesian phylogenetics faces the challenging computational problem of moving throughout the high-dimensional space of trees. Fortunately, hyperbolic space offers a low-dimensional representation of tree-like data. In this paper, we embed genomic sequences as points in hyperbolic space and perform hyperbolic Markov Chain Monte Carlo for Bayesian inference in this space. The posterior probability of an embedding is computed by decoding a neighbour-joining tree from the embedding locations of the sequences. We empirically demonstrate the fidelity of this method on eight data sets. We systematically investigated the effect of embedding dimension and hyperbolic curvature on the performance in these data sets. The sampled posterior distribution recovers the splits and branch lengths to a high degree over a range of curvatures and dimensions. We systematically investigated the effects of the embedding space's curvature and dimension on the Markov Chain's performance, demonstrating the suitability of hyperbolic space for phylogenetic inference.

Author summary

Why was this study done? Tree structures are widely used in fields such as phylogenetics; however, modifying the layout and branch lengths of these structures simultaneously is a high-dimensional problem. Recent work in machine learning has demonstrated the usefulness of representing tree-like data as points in low-dimensional hyperbolic space. We aimed to explore new ways of representing phylogenetic trees so they can be modified in a continuous manner.

What did the researchers do and find? We represented trees by the locations of their embedded genomic sequences in hyperbolic space. We perturbed these continuous encoding locations and decoded an altered discrete tree structure. Using this technique, we performed Bayesian inference and computed the posterior distribution of eight standard datasets to demonstrate the feasibility of phylogenetic inference with this representation. We found that hyperbolic space is suitable for Bayesian phylogenetics and is most efficient across a broad range of hyperbolic curvatures with low dimensionality.

What do these findings mean? This method diversifies the way numerical methods can navigate the space of trees, both in phylogenetics and more broadly. With hyperbolic embeddings, scalable online inference is possible by quickly adding taxa to a tree or a distribution of trees. This method could open a wealth of powerful continuum-based methods to navigate the space of trees.
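
The decoding step described in the abstract (sequence locations in hyperbolic space, then pairwise distances, then a neighbour-joining tree) can be illustrated with a minimal sketch. The abstract does not say which model of hyperbolic space or which software the authors used, so the Poincaré ball parameterisation, the function names, and the toy coordinates below are all assumptions made for illustration. The likelihood, prior, and Markov Chain Monte Carlo proposal steps are omitted.

```python
import math


def poincare_distance(x, y, curvature=-1.0):
    """Geodesic distance in a Poincare ball of negative curvature (assumed model)."""
    c = -curvature
    if c <= 0:
        raise ValueError("curvature must be negative")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    norm_x = sum(a * a for a in x)
    norm_y = sum(b * b for b in y)
    # Points must lie strictly inside the ball of radius 1/sqrt(c).
    denom = (1.0 - c * norm_x) * (1.0 - c * norm_y)
    return math.acosh(1.0 + 2.0 * c * sq_diff / denom) / math.sqrt(c)


def distance_matrix(points, curvature=-1.0):
    """Symmetric matrix of pairwise hyperbolic distances."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = poincare_distance(points[i], points[j], curvature)
    return dist


def neighbour_joining(dist, names):
    """Textbook neighbour joining; returns (node, neighbour, branch_length) edges.

    Branch lengths can come out negative when distances are far from additive.
    """
    d = {a: {b: dist[i][j] for j, b in enumerate(names)} for i, a in enumerate(names)}
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        row_sum = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Choose the pair that minimises the neighbour-joining Q criterion.
        best = None
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                q = (n - 2) * d[a][b] - row_sum[a] - row_sum[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = f"internal{next_id}"  # hypothetical label for the new internal node
        next_id += 1
        len_a = 0.5 * d[a][b] + (row_sum[a] - row_sum[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
        # Distances from the new internal node to every remaining node.
        d[u] = {u: 0.0}
        for k in nodes:
            if k not in (a, b):
                d[u][k] = d[k][u] = 0.5 * (d[a][k] + d[b][k] - d[a][b])
        nodes = [k for k in nodes if k not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


# Toy embedding (made-up coordinates): two tight pairs on opposite sides of the disk.
names = ["A", "B", "C", "D"]
points = [[0.5, 0.05], [0.5, -0.05], [-0.5, 0.05], [-0.5, -0.05]]
edges = neighbour_joining(distance_matrix(points), names)
for node, neighbour, length in edges:
    print(f"{node} -- {neighbour}: {length:.4f}")
```

In a sampler of the kind the abstract describes, the coordinates in `points` would be perturbed at each step, the tree decoded again, and its posterior probability evaluated. Only the decoding is shown here.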
Pages: 20