Bounded Occurrence Edit Distance: A New Metric for String Similarity Joins with Edit Distance Constraints

Cited by: 0
Authors
Komatsu, Tomoki [1]
Okuta, Ryosuke [1]
Narisawa, Kazuyuki [1]
Shinohara, Ayumi [1]
Affiliations
[1] Tohoku Univ, Grad Sch Informat Sci, Sendai, Miyagi 980, Japan
Keywords
Edit distance; Similarity join problem; Similarity search; Data integration
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Given two sets of strings and a similarity function on strings, similarity joins attempt to find all similar pairs of strings from each respective set. In this paper, we focus on similarity joins with respect to the edit distance, and propose a new metric called the bounded occurrence edit distance and a filter based on the metric. Using the filter, we can reduce the total time required to solve similarity joins because the metric can be computed faster than the edit distance by bitwise operations. We demonstrate the effectiveness of the filter through experiments.
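The filter-then-verify idea in the abstract can be sketched roughly as follows. This record does not define the bounded occurrence edit distance, so the sketch substitutes the well-known bag distance (a character-count lower bound on the edit distance) as a stand-in filter; it is not the authors' metric and uses no bitwise operations. The function names `bag_distance` and `similarity_join` and the threshold parameter `k` are assumptions made for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def bag_distance(a: str, b: str) -> int:
    """Stand-in filter (NOT the paper's metric): a lower bound on the
    edit distance computed from character counts alone."""
    ca, cb = Counter(a), Counter(b)
    only_a = sum((ca - cb).values())  # characters in excess in a
    only_b = sum((cb - ca).values())  # characters in excess in b
    return max(only_a, only_b)


def similarity_join(left, right, k):
    """Return all pairs (s, t) with edit_distance(s, t) <= k."""
    result = []
    for s in left:
        for t in right:
            # Cheap filter: if the lower bound already exceeds k,
            # the pair cannot qualify, so skip the expensive check.
            if bag_distance(s, t) > k:
                continue
            if edit_distance(s, t) <= k:
                result.append((s, t))
    return result
```

Because the filter never exceeds the true edit distance, discarding a pair whose filter value is above `k` cannot lose a valid result; the saving comes from how many pairs skip the quadratic-time verification step.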
Pages: 363 - 374
Page count: 12
Related papers
50 in total
  • [41] Fast Similarity Search for Graphs by Edit Distance
    Rachkovskij, D. A.
    CYBERNETICS AND SYSTEMS ANALYSIS, 2019, 55 (06) : 1039 - 1051
  • [42] The Edit Distance as a Measure of Perceived Rhythmic Similarity
    Post, Olaf
    Toussaint, Godfried
    EMPIRICAL MUSICOLOGY REVIEW, 2011, 6 (03) : 164 - 179
  • [43] Graph Edit Distance or Graph Edit Pseudo-Distance?
    Serratosa, Francesc
    Cortes, Xavier
    Moreno, Carlos-Francisco
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2016, 2016, 10029 : 530 - 540
  • [44] Graph Similarity Using Tree Edit Distance
    Dwivedi, Shri Prakash
    Srivastava, Vishal
    Gupta, Umesh
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, S+SSPR 2022, 2022, 13813 : 233 - 241
  • [45] Fast Similarity Search for Graphs by Edit Distance
    D. A. Rachkovskij
    Cybernetics and Systems Analysis, 2019, 55 : 1039 - 1051
  • [46] Online Pattern Matching for String Edit Distance with Moves
    Takabatake, Yoshimasa
    Tabei, Yasuo
    Sakamoto, Hiroshi
    STRING PROCESSING AND INFORMATION RETRIEVAL, SPIRE 2014, 2014, 8799 : 203 - 214
  • [47] Online signature verification based on string edit distance
    Riesen, Kaspar
    Schmidt, Roman
    INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2019, 22 (01) : 41 - 54
  • [48] Computing the Expected Edit Distance from a String to a PFA
    Calvo-Zaragoza, Jorge
    de la Higuera, Colin
    Oncina, Jose
    Implementation and Application of Automata, 2016, 9705 : 39 - 50
  • [49] Compressed String Dictionary Search with Edit Distance One
    Belazzougui, Djamal
    Venturini, Rossano
    ALGORITHMICA, 2016, 74 (03) : 1099 - 1122
  • [50] An algorithm for string edit distance allowing substring reversals
    Arslan, Abdullah N.
    BIBE 2006: SIXTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2006, : 220 - +