Compressed String Dictionary Search with Edit Distance One

被引:4
|
作者
Belazzougui, Djamal [1 ]
Venturini, Rossano [2 ]
机构
[1] Univ Helsinki, HIIT, Dept Comp Sci, Helsinki, Finland
[2] Univ Pisa, Dept Comp Sci, Pisa, Italy
基金
芬兰科学院;
关键词
Compressed data structure; Pattern matching; Approximate search; LOOK-UP; STORAGE;
D O I
10.1007/s00453-015-9990-0
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
In this paper we present different solutions for the problem of indexing a dictionary of strings in compressed space. Given a pattern P, the index has to report all the strings in the dictionary having edit distance at most one with P. Our first solution is able to solve queries in (almost optimal) O(vertical bar P vertical bar + occ) time where occ is the number of strings in the dictionary having edit distance at most one with P. The space complexity of this solution is bounded in terms of the kth order entropy of the indexed dictionary. A second solution further improves this space complexity at the cost of increasing the query time. Finally, we propose randomized solutions (Monte Carlo and Las Vegas) which achieve simultaneously the time complexity of the first solution and the space complexity of the second one.
引用
收藏
页码:1099 / 1122
页数:24
相关论文
共 50 条
  • [1] Compressed String Dictionary Search with Edit Distance One
    Djamal Belazzougui
    Rossano Venturini
    Algorithmica, 2016, 74 : 1099 - 1122
  • [2] Approximating tree edit distance through string edit distance
    Akutsu, Tatsuya
    Fukagawa, Daiji
    Takasu, Atsuhiro
    ALGORITHMS AND COMPUTATION, PROCEEDINGS, 2006, 4288 : 90 - +
  • [3] Approximating Tree Edit Distance through String Edit Distance
    Akutsu, Tatsuya
    Fukagawa, Daiji
    Takasu, Atsuhiro
    ALGORITHMICA, 2010, 57 (02) : 325 - 348
  • [4] Approximating Tree Edit Distance through String Edit Distance
    Tatsuya Akutsu
    Daiji Fukagawa
    Atsuhiro Takasu
    Algorithmica, 2010, 57 : 325 - 348
  • [5] A unified framework for string similarity search with edit-distance constraint
    Yu, Minghe
    Wang, Jin
    Li, Guoliang
    Zhang, Yong
    Deng, Dong
    Feng, Jianhua
    VLDB JOURNAL, 2017, 26 (02): : 249 - 274
  • [6] minIL: A Simple and Small Index for String Similarity Search with Edit Distance
    Yang, Zhong
    Zheng, Bolong
    Wang, Xianzhi
    Li, Guohui
    Zhou, Xiaofang
    2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022, : 565 - 577
  • [7] siEDM: An Efficient String Index and Search Algorithm for Edit Distance with Moves
    Takabatake, Yoshimasa
    Nakashima, Kenta
    Kuboyama, Tetsuji
    Tabei, Yasuo
    Sakamoto, Hiroshi
    ALGORITHMS, 2016, 9 (02)
  • [8] Neural String Edit Distance
    Libovicky, Jindrich
    Fraser, Alexander
    PROCEEDINGS OF THE SIXTH WORKSHOP ON STRUCTURED PREDICTION FOR NLP (SPNLP 2022), 2022, : 52 - 66
  • [9] A unified framework for string similarity search with edit-distance constraint
    Minghe Yu
    Jin Wang
    Guoliang Li
    Yong Zhang
    Dong Deng
    Jianhua Feng
    The VLDB Journal, 2017, 26 : 249 - 274
  • [10] Top-k String Similarity Search with Edit-Distance Constraints
    Deng, Dong
    Li, Guoliang
    Feng, Jianhua
    Li, Wen-Syan
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 925 - 936