Compressed String Dictionary Search with Edit Distance One

Authors
Djamal Belazzougui
Rossano Venturini
Affiliations
[1] University of Helsinki, Department of Computer Science, Helsinki Institute for Information Technology HIIT
[2] University of Pisa, Department of Computer Science
Source
Algorithmica | 2016 / Vol. 74
Keywords
Compressed data structure; Pattern matching; Approximate search
DOI
Not available
Abstract
In this paper we present different solutions for the problem of indexing a dictionary of strings in compressed space. Given a pattern P, the index has to report all the strings in the dictionary having edit distance at most one with P. Our first solution is able to solve queries in (almost optimal) O(|P| + occ) time, where occ is the number of strings in the dictionary having edit distance at most one with P.
The space complexity of this solution is bounded in terms of the k-th order entropy of the indexed dictionary. A second solution further improves this space complexity at the cost of increasing the query time. Finally, we propose randomized solutions (Monte Carlo and Las Vegas) which simultaneously achieve the time complexity of the first solution and the space complexity of the second one.
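
To make the query in the abstract concrete, the following is a minimal sketch of a naive, uncompressed baseline. It is not the paper's index: it scans every dictionary string, so its running time is proportional to the total dictionary length rather than O(|P| + occ). The function names are hypothetical, and the sketch assumes that one edit means a single character insertion, deletion, or substitution.

```python
def within_edit_distance_one(a: str, b: str) -> bool:
    """Return True if a and b differ by at most one insertion,
    deletion, or substitution (assumed definition of an edit)."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    # Skip the longest common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Exact match or one substitution at position i:
        # everything after position i must agree.
        return a[i + 1:] == b[i + 1:]
    # b is one character longer: drop b[i] and compare the rest.
    return a[i:] == b[i + 1:]


def search_edit_distance_one(dictionary, pattern):
    """Naive baseline: report every dictionary string within
    edit distance one of the pattern, by linear scan."""
    return [s for s in dictionary if within_edit_distance_one(s, pattern)]


if __name__ == "__main__":
    words = ["cat", "cart", "cut", "dog", "at", "cats", "coat"]
    print(search_edit_distance_one(words, "cat"))
```

In this sketch, occ corresponds to the length of the returned list. The solutions described in the abstract avoid the linear scan and store the dictionary in compressed space.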
Pages: 1099 - 1122
Number of pages: 23
Related papers
50 in total
  • [1] Compressed String Dictionary Search with Edit Distance One
    Belazzougui, Djamal
    Venturini, Rossano
    [J]. ALGORITHMICA, 2016, 74 (03) : 1099 - 1122
  • [2] Approximating tree edit distance through string edit distance
    Akutsu, Tatsuya
    Fukagawa, Daiji
    Takasu, Atsuhiro
    [J]. ALGORITHMS AND COMPUTATION, PROCEEDINGS, 2006, 4288 : 90 - +
  • [3] Approximating Tree Edit Distance through String Edit Distance
    Akutsu, Tatsuya
    Fukagawa, Daiji
    Takasu, Atsuhiro
    [J]. ALGORITHMICA, 2010, 57 (02) : 325 - 348
  • [4] Approximating Tree Edit Distance through String Edit Distance
    Tatsuya Akutsu
    Daiji Fukagawa
    Atsuhiro Takasu
    [J]. Algorithmica, 2010, 57 : 325 - 348
  • [5] A unified framework for string similarity search with edit-distance constraint
    Yu, Minghe
    Wang, Jin
    Li, Guoliang
    Zhang, Yong
    Deng, Dong
    Feng, Jianhua
    [J]. VLDB JOURNAL, 2017, 26 (02) : 249 - 274
  • [6] minIL: A Simple and Small Index for String Similarity Search with Edit Distance
    Yang, Zhong
    Zheng, Bolong
    Wang, Xianzhi
    Li, Guohui
    Zhou, Xiaofang
    [J]. 2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022), 2022, : 565 - 577
  • [7] siEDM: An Efficient String Index and Search Algorithm for Edit Distance with Moves
    Takabatake, Yoshimasa
    Nakashima, Kenta
    Kuboyama, Tetsuji
    Tabei, Yasuo
    Sakamoto, Hiroshi
    [J]. ALGORITHMS, 2016, 9 (02)
  • [8] Neural String Edit Distance
    Libovicky, Jindrich
    Fraser, Alexander
    [J]. PROCEEDINGS OF THE SIXTH WORKSHOP ON STRUCTURED PREDICTION FOR NLP (SPNLP 2022), 2022, : 52 - 66
  • [9] A unified framework for string similarity search with edit-distance constraint
    Minghe Yu
    Jin Wang
    Guoliang Li
    Yong Zhang
    Dong Deng
    Jianhua Feng
    [J]. The VLDB Journal, 2017, 26 : 249 - 274
  • [10] Top-k String Similarity Search with Edit-Distance Constraints
    Deng, Dong
    Li, Guoliang
    Feng, Jianhua
    Li, Wen-Syan
    [J]. 2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 925 - 936