Fidelity of hyperbolic space for Bayesian phylogenetic inference

Cited by: 4
Authors
Macaulay, Matthew O. [1]
Darling, Aaron [2]
Fourment, Mathieu O. [1]
Affiliations
[1] Univ Technol Sydney, Australian Inst Microbiol & Infect, Sydney, Australia
[2] Illumina Australia Pty Ltd, Sydney, Australia
Funding
Australian Research Council
Keywords
MAXIMUM-LIKELIHOOD; TREE; PERFORMANCE; PROPOSALS;
DOI
10.1371/journal.pcbi.1011084
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Bayesian inference for phylogenetics is a gold standard for computing distributions of phylogenies. However, Bayesian phylogenetics faces the challenging computational problem of moving throughout the high-dimensional space of trees. Fortunately, hyperbolic space offers a low-dimensional representation of tree-like data. In this paper, we embed genomic sequences as points in hyperbolic space and perform hyperbolic Markov chain Monte Carlo for Bayesian inference in this space. The posterior probability of an embedding is computed by decoding a neighbour-joining tree from the embedding locations of the sequences. We empirically demonstrate the fidelity of this method on eight data sets. We systematically investigated the effect of embedding dimension and hyperbolic curvature on the performance in these data sets. The sampled posterior distribution recovers the splits and branch lengths to a high degree over a range of curvatures and dimensions. We systematically investigated the effects of the embedding space's curvature and dimension on the Markov chain's performance, demonstrating the suitability of hyperbolic space for phylogenetic inference.

Author summary

Why was this study done?
Tree structures are widely used in fields such as phylogenetics; however, modifying the layout and branch lengths of these structures simultaneously is a high-dimensional problem. Recent work in machine learning has demonstrated the usefulness of representing tree-like data as points in low-dimensional hyperbolic space. We aimed to explore new ways of representing phylogenetic trees so they can be modified in a continuous manner.

What did the researchers do and find?
We represented trees by the locations of their embedded genomic sequences in hyperbolic space. We perturbed these continuous encoding locations and decoded an altered discrete tree structure. Using this technique, we performed Bayesian inference and computed the posterior distribution of eight standard data sets to demonstrate the feasibility of phylogenetic inference with this representation. We found that hyperbolic space is suitable for Bayesian phylogenetics and that it is most efficient in low dimensions across a broad range of hyperbolic curvatures.

What do these findings mean?
This method diversifies the way numerical methods can navigate the space of trees, both in phylogenetics and more broadly. With hyperbolic embeddings, scalable online inference is possible by quickly adding taxa to a tree or a distribution of trees. This method could open a wealth of powerful continuum-based methods to navigate the space of trees.
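
The decoding step described in the abstract starts from pairwise distances between the embedded sequences, which a neighbour-joining routine would then turn into a tree. The following is a minimal sketch of that first step only, not the authors' implementation. It assumes the Poincaré ball model of hyperbolic space (the abstract does not say which model the paper uses), and the names `poincare_distance` and `pairwise_distances` are hypothetical.

```python
import math


def poincare_distance(x, y, curvature=-1.0):
    """Geodesic distance between two points inside the open unit ball.

    Assumption: Poincare ball model; a negative `curvature` rescales the
    unit-curvature distance by 1 / sqrt(-curvature).
    """
    if curvature >= 0:
        raise ValueError("curvature must be negative for hyperbolic space")
    norm_x = sum(a * a for a in x)
    norm_y = sum(b * b for b in y)
    if norm_x >= 1.0 or norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - norm_x) * (1.0 - norm_y))
    return math.acosh(arg) / math.sqrt(-curvature)


def pairwise_distances(points, curvature=-1.0):
    """Symmetric distance matrix that a neighbour-joining decoder could consume."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = poincare_distance(points[i], points[j], curvature)
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical 2-D embedding of three taxa.
embedding = [(0.0, 0.0), (0.5, 0.0), (0.0, -0.5)]
matrix = pairwise_distances(embedding, curvature=-1.0)
```

Perturbing the coordinates in `embedding` changes `matrix` continuously, while the tree decoded from it can change topology. This is the sense in which the abstract describes modifying a discrete tree through a continuous encoding.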
Pages: 20