A normalized Levenshtein distance metric

Cited by: 468
Authors
Li Yujian
Liu Bo
Affiliations
[1] Beijing Univ Technol, Coll Comp Sci & Technol, Beijing 100022, Peoples R China
[2] Beijing Municipal Key Lab Multimedia & Intelligen, Beijing, Peoples R China
Keywords
sequence comparison; Levenshtein distance; normalized edit distance; metric; AESA;
DOI
10.1109/TPAMI.2007.1070
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although a number of normalized edit distances presented so far may offer good performance in some applications, none of them can be regarded as a genuine metric between strings because they do not satisfy the triangle inequality. Given two strings X and Y over a finite alphabet, this paper defines a new normalized edit distance between X and Y as a simple function of their lengths (|X| and |Y|) and the Generalized Levenshtein Distance (GLD) between them. The new distance can be easily computed through GLD with a complexity of O(|X| · |Y|), and it is a metric valued in [0, 1] under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. Experiments using the AESA algorithm in handwritten digit recognition show that the new distance can generally provide similar results to some other normalized edit distances and may perform slightly better if the triangle inequality is violated in a particular data set.
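The abstract does not give the function itself. As a minimal sketch, assuming the form usually quoted for this paper, 2·GLD(X, Y) / (α·(|X| + |Y|) + GLD(X, Y)), and assuming unit edit costs so that α = 1 and GLD reduces to the plain Levenshtein distance, the computation might look like this in Python. The function names `levenshtein` and `normalized_levenshtein` are illustrative and are not taken from the paper.

```python
def levenshtein(x: str, y: str) -> int:
    """Unit-cost Levenshtein distance, O(|X| * |Y|) time, two-row dynamic program."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # turning x[:i] into the empty string takes i deletions
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normalized_levenshtein(x: str, y: str) -> float:
    """Assumed normalized form with unit costs (alpha = 1); result lies in [0, 1]."""
    if not x and not y:
        return 0.0  # avoid 0/0 for two empty strings
    d = levenshtein(x, y)
    return 2.0 * d / (len(x) + len(y) + d)
```

For example, "kitten" and "sitting" are 3 edits apart, so the normalized value is 2·3 / (13 + 3) = 0.375. The value is 0 for identical strings and 1 when exactly one of the strings is empty.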
Pages: 1091 - 1095
Page count: 5
Related papers
50 in total
  • [31] Enhancing levenshtein distance algorithm for assessing behavioral trust
    Pirmez, Luci
    Carmo, Luiz F. R. C.
    Bacellar, Luiz F. H.
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2010, 25 (01): : 5 - 13
  • [32] Indo-European languages tree by Levenshtein distance
    Serva, M.
    Petroni, F.
    EPL, 2008, 81 (06)
  • [33] Levenshtein Distance Embedding with Poisson Regression for DNA Storage
    Wei, Xiang
    Guo, Alan J. X.
    Sun, Sihan
    Wei, Mengyi
    Yu, Wei
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 14, 2024, : 15796 - 15804
  • [34] Slice Distance: An Insert-only Levenshtein Distance with a Focus on Security Applications
    Afzal, Zeeshan
    Garcia, Johan
    Lindskog, Stefan
    Brunstrom, Anna
    2018 9TH IFIP INTERNATIONAL CONFERENCE ON NEW TECHNOLOGIES, MOBILITY AND SECURITY (NTMS), 2018,
  • [35] The Normalized Freebase Distance
    Godin, Frederic
    De Nies, Tom
    Beecks, Christian
    De Vocht, Laurens
    De Neve, Wesley
    Mannens, Erik
    Seidl, Thomas
    Van de Walle, Rik
    SEMANTIC WEB: ESWC 2014 SATELLITE EVENTS, 2014, 8798 : 218 - 221
  • [36] The normalized distance Laplacian
    Reinhart, Carolyn
    arXiv, 2019,
  • [37] The normalized distance Laplacian
    Reinhart, Carolyn
    SPECIAL MATRICES, 2021, 9 (01): : 1 - 18
  • [38] Rate-compatible pruned convolutional codes and Viterbi decoding with the Levenshtein distance metric applied to channels with insertion, deletion, and substitution errors
    Cheng, L
    Ferreira, HC
    2004 IEEE AFRICON: 7TH AFRICON CONFERENCE IN AFRICA, VOLS 1 AND 2: TECHNOLOGY INNOVATION, 2004, : 137 - 143
  • [39] The similarity metric and the distance metric
    Ma, B
    Zhang, KZ
    PROCEEDINGS OF THE 8TH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1-3, 2005, : 1239 - 1242
  • [40] On the similarity metric and the distance metric
    Chen, Shihyen
    Ma, Bin
    Zhang, Kaizhong
    THEORETICAL COMPUTER SCIENCE, 2009, 410 (24-25) : 2365 - 2376