SuMoTED: An intuitive edit distance between rooted unordered uniquely-labelled trees

Cited by: 6
Authors
McVicar, Matt [1]
Sach, Benjamin [2]
Mesnage, Cedric [1]
Lijffijt, Jefrey [1,3]
Spyropoulou, Eirini [1]
De Bie, Tijl [1,3]
Affiliations
[1] Univ Bristol, Dept Engn Math, Woodland Rd, Bristol BS8 1UB, England
[2] Univ Bristol, Dept Comp Sci, Woodland Rd, Bristol BS8 1UB, England
[3] Univ Ghent, Data Sci Lab, B-9000 Ghent, Belgium
Funding
European Research Council; Engineering and Physical Sciences Research Council (UK)
Keywords
Tree edit distance; Taxonomies; Algorithms
DOI
10.1016/j.patrec.2016.04.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Defining and computing distances between tree structures is a classical area of study in theoretical computer science, with practical applications in computational biology, information retrieval, text analysis, and many other areas. In this paper, we focus on rooted, unordered, uniquely-labelled trees such as taxonomies and other hierarchies. For such trees, we introduce the intuitive concept of a 'local move' operation as an atomic edit of a tree. We then introduce SuMoTED, a new edit distance measure between such trees, defined as the minimal number of local moves required to convert one tree into another. We show how SuMoTED can be computed using a scalable algorithm with quadratic time complexity. Finally, we demonstrate its use on a collection of music genre taxonomies. © 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
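As an illustration of the definition "minimal number of local moves", the following is a minimal sketch only. The abstract does not specify what a local move is, so the sketch assumes two hypothetical moves: reattaching a node (with its subtree) to its grandparent, or reattaching it under one of its siblings. It searches exhaustively with breadth-first search, so it is exponential and suitable only for tiny trees; it is not the quadratic-time algorithm the abstract refers to. The function names and the genre labels are invented for the example.

```python
from collections import deque


def local_moves(parent):
    """Yield every tree reachable by one ASSUMED local move.

    `parent` maps each non-root label to its parent's label; the root
    appears only as a value. Labels are unique, children are unordered.
    """
    for node, p in parent.items():
        # Assumed "up" move: reattach node (and its subtree) to its grandparent.
        if p in parent:
            moved = dict(parent)
            moved[node] = parent[p]
            yield moved
        # Assumed "down" move: reattach node (and its subtree) under a sibling.
        for sibling, sibling_parent in parent.items():
            if sibling != node and sibling_parent == p:
                moved = dict(parent)
                moved[node] = sibling
                yield moved


def brute_force_distance(source, target):
    """Fewest assumed local moves turning `source` into `target` (BFS)."""
    def key(tree):
        return frozenset(tree.items())

    goal = key(target)
    seen = {key(source)}
    queue = deque([(source, 0)])
    while queue:
        tree, dist = queue.popleft()
        if key(tree) == goal:
            return dist
        for nxt in local_moves(tree):
            if key(nxt) not in seen:
                seen.add(key(nxt))
                queue.append((nxt, dist + 1))
    return None  # only reached if the trees differ in labels or root


# Hypothetical genre taxonomies over the same labels, rooted at "music".
flat = {"rock": "music", "jazz": "music", "punk": "music"}
nested = {"rock": "music", "jazz": "music", "punk": "rock"}
print(brute_force_distance(flat, nested))
```

Storing the tree as a child-to-parent map makes each assumed move a single dictionary update, and because labels are unique, two trees are equal exactly when their maps are equal.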
Pages: 52-59
Number of pages: 8