A normalized Levenshtein distance metric

Cited by: 468
Authors
Li Yujian
Liu Bo
Affiliations
[1] Beijing Univ Technol, Coll Comp Sci & Technol, Beijing 100022, Peoples R China
[2] Beijing Municipal Key Lab Multimedia & Intelligen, Beijing, Peoples R China
Keywords
sequence comparison; Levenshtein distance; normalized edit distance; metric; AESA
DOI
10.1109/TPAMI.2007.1070
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although a number of normalized edit distances presented so far may offer good performance in some applications, none of them can be regarded as a genuine metric between strings because they do not satisfy the triangle inequality. Given two strings X and Y over a finite alphabet, this paper defines a new normalized edit distance between X and Y as a simple function of their lengths (|X| and |Y|) and the Generalized Levenshtein Distance (GLD) between them. The new distance can be easily computed through GLD with a complexity of O(|X| · |Y|), and it is a metric valued in [0, 1] under the condition that the weight function is a metric over the set of elementary edit operations with all costs of insertions/deletions having the same weight. Experiments using the AESA algorithm in handwritten digit recognition show that the new distance can generally provide similar results to some other normalized edit distances and may perform slightly better if the triangle inequality is violated in a particular data set.
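The abstract does not spell out the "simple function" itself. As a minimal sketch, assuming the form usually quoted from the published paper, 2·GLD(X, Y) / (α·(|X| + |Y|) + GLD(X, Y)), with α the common insertion/deletion weight, the unit-cost case (α = 1, substitutions also costing 1) might look like the following. The function names `levenshtein` and `normalized_levenshtein` are hypothetical, and the formula is an assumption recalled from the paper rather than something stated in this record.

```python
def levenshtein(x: str, y: str) -> int:
    """Unit-cost Levenshtein distance via the standard O(|x| * |y|) dynamic program."""
    prev = list(range(len(y) + 1))  # distances from the empty prefix of x
    for i, cx in enumerate(x, start=1):
        curr = [i]  # deleting the first i characters of x
        for j, cy in enumerate(y, start=1):
            curr.append(min(
                prev[j] + 1,               # delete cx
                curr[j - 1] + 1,           # insert cy
                prev[j - 1] + (cx != cy),  # substitute, or match at cost 0
            ))
        prev = curr
    return prev[-1]


def normalized_levenshtein(x: str, y: str) -> float:
    """Assumed normalization: 2 * d / (|x| + |y| + d), taking alpha = 1."""
    if not x and not y:
        return 0.0  # two empty strings are identical; avoids dividing by zero
    d = levenshtein(x, y)
    return 2.0 * d / (len(x) + len(y) + d)


print(normalized_levenshtein("kitten", "sitting"))  # 3 edits: 6 / 16 = 0.375
```

Under these assumptions the value is 0 for identical strings and reaches 1 when one string is empty and the other is not, which is consistent with the [0, 1] range claimed in the abstract.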
Pages: 1091-1095
Page count: 5
Related papers
50 in total
  • [1] Online Handwriting Recognition Using Levenshtein Distance Metric
    Chowdhury, S. Dutta
    Bhattacharya, U.
    Parui, S. K.
    2013 12TH INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2013, : 79 - 83
  • [2] Bidirectional Viterbi decoding using the Levenshtein distance metric for deletion channels
    Cheng, Ling
    Ferreira, Hendrik C.
    Swart, Theo G.
    PROCEEDINGS OF 2006 IEEE INFORMATION THEORY WORKSHOP, 2006, : 254 - +
  • [3] REPRESENTING TONE IN LEVENSHTEIN DISTANCE
    Yang, Cathryn
    Castro, Andy
    INTERNATIONAL JOURNAL OF HUMANITIES AND ARTS COMPUTING, 2008, 2 (1-2) : 205 - 219
  • [4] Approximate periods with Levenshtein distance
    Simunek, Martin
    Melichar, Borivoj
    IMPLEMENTATION AND APPLICATION OF AUTOMATA, PROCEEDINGS, 2008, 5148 : 286 - 287
  • [5] Kernels based on weighted Levenshtein distance
    Xu, JH
    Zhang, XG
    2004 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2004, : 3015 - 3018
  • [6] Levenshtein distance for graph spectral features
    Wilson, RC
    Hancock, ER
    PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOL 2, 2004, : 489 - 492
  • [7] Pruned convolutional codes and Viterbi decoding using the Levenshtein distance metric applied to asynchronous noisy channels
    Dept. of Electrical and Electronic Engineering Science, University of Johannesburg, P O Box 524, Auckland Park, 2006, South Africa
    Trans. S. Afr. Inst. Electr. Eng., 2006, 2 (140-145):
  • [8] PRUNED CONVOLUTIONAL CODES AND VITERBI DECODING USING THE LEVENSHTEIN DISTANCE METRIC APPLIED TO ASYNCHRONOUS NOISY CHANNELS
    Cheng, L.
    Ferreira, H. C.
    SAIEE AFRICA RESEARCH JOURNAL, 2006, 97 (02): : 140 - 145
  • [9] Computing the Levenshtein distance of a regular language
    Konstantinidis, S
    Proceedings of the IEEE ITSOC Information Theory Workshop 2005 on Coding and Complexity, 2005, : 114 - 117
  • [10] Clustering of web sessions using Levenshtein metric
    Scherbina, A
    Kuznetsov, S
    ADVANCES IN DATA MINING: APPLICATIONS IN IMAGE MINING, MEDICINE AND BIOTECHNOLOGY, MANAGEMENT AND ENVIRONMENTAL CONTROL, AND TELECOMMUNICATIONS, 2004, 3275 : 127 - 133