A generalized Robinson-Foulds distance for labeled trees

Cited by: 11
Authors
Briand, Samuel [1]
Dessimoz, Christophe [2,3,4,5,6]
El-Mabrouk, Nadia [1]
Lafond, Manuel [7]
Lobinska, Gabriela [3]
Affiliations
[1] Univ Montreal, Comp Sci Dept, Montreal, PQ, Canada
[2] Univ Lausanne, Dept Computat Biol, Lausanne, Switzerland
[3] UCL, Dept Genet Evolut & Environm, London, England
[4] Univ Lausanne, Ctr Integrat Genom, Lausanne, Switzerland
[5] Swiss Inst Bioinformat, Lausanne, Switzerland
[6] UCL, Dept Comp Sci, London, England
[7] Univ Sherbrooke, Comp Sci Dept, Sherbrooke, PQ, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Swiss National Science Foundation
Keywords
Edit distance; Labeled trees; Robinson-Foulds; Tree metric; Phylogenetic trees
DOI
10.1186/s12864-020-07011-0
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, the RF metric ignores a crucial aspect of the trees: the type of each branching event (e.g. speciation, duplication, transfer, etc.).
Results: We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that, contrary to the unlabeled case, an optimal edit path may require contracting "good" edges, i.e. edges shared between the two trees.
Conclusions: We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions. Implementation and simulations available at .
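For readers unfamiliar with the classical (unlabeled) RF distance mentioned in the abstract, the following is a minimal sketch for rooted trees, assuming trees are given as nested Python tuples with string leaves. The representation and the names `clades` and `rf_distance` are illustrative assumptions, not taken from the paper. The sketch counts the clades present in exactly one of the two trees; it does not implement the labeled extension with node flips, and this naive version is not the linear-time algorithm the abstract refers to.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical tree encoding: a leaf is a string, an internal node is a tuple of children.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Collect the leaf set below every internal node except the root."""
    found: Set[FrozenSet[str]] = set()

    def leaves_below(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(leaves_below(child) for child in node))
        found.add(leaves)
        return leaves

    root_leaves = leaves_below(tree)
    # The root clade is shared by all trees on the same leaf set, so it is dropped.
    found.discard(root_leaves)
    return found


def rf_distance(t1: Tree, t2: Tree) -> int:
    """Classical rooted RF distance: size of the symmetric difference of clade sets."""
    return len(clades(t1) ^ clades(t2))


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(rf_distance(t1, t2))
```

Each clade in the symmetric difference corresponds to one edge that must be contracted in one tree or added in the other, which is the edit-operation view of RF that the labeled variant in the abstract builds on.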
Pages: 13
Related papers
50 in total
  • [31] Linear-Time Algorithms for Some Phylogenetic Tree Completion Problems Under Robinson-Foulds Distance
    Bansal, Mukul S.
    [J]. COMPARATIVE GENOMICS (RECOMB-CG 2018), 2018, 11183 : 209 - 226
  • [32] Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
    Xilin Yu
    Thien Le
    Sarah A. Christensen
    Erin K. Molloy
    Tandy Warnow
    [J]. Algorithms for Molecular Biology, 16
  • [33] Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation
    Yu, Xilin
    Le, Thien
    Christensen, Sarah A.
    Molloy, Erin K.
    Warnow, Tandy
    [J]. ALGORITHMS FOR MOLECULAR BIOLOGY, 2021, 16 (01)
  • [34] A sublinear-time randomized approximation scheme for the robinson-foulds metric
    Pattengale, Nicholas D.
    Moret, Bernard M. E.
    [J]. Lect. Notes Comput. Sci., 1600, (221-230):
  • [35] A sublinear-time randomized approximation scheme for the Robinson-Foulds metric
    Pattengale, Nicholas D.
    Moret, Bernard M. E.
    [J]. RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY, PROCEEDINGS, 2006, 3909 : 221 - 230
  • [36] FastRFS: fast and accurate Robinson-Foulds Supertrees using constrained exact optimization
    Vachaspati, Pranjal
    Warnow, Tandy
    [J]. BIOINFORMATICS, 2017, 33 (05) : 631 - 639
  • [37] Building alternative consensus trees and supertrees using k-means and Robinson and Foulds distance
    Tahiri, Nadia
    Fichet, Bernard
    Makarenkov, Vladimir
    [J]. BIOINFORMATICS, 2022, 38 (13) : 3367 - 3376
  • [38] The Connection of the Generalized Robinson–Foulds Metric with Partial Wiener Indices
    Damir Vukičević
    Domagoj Matijević
    [J]. Acta Biotheoretica, 2023, 71
  • [39] Invariant transformers of robinson and foulds distance matrices for convolutional neural network
    Tahiri, Nadia
    Veriga, Andrey
    Koshkarov, Aleksandr
    Morozov, Boris
    [J]. JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2022, 20 (04)
  • [40] A program to compute the soft Robinson–Foulds distance between phylogenetic networks
    Bingxin Lu
    Louxin Zhang
    Hon Wai Leong
    [J]. BMC Genomics, 18