Compressed String Dictionary Search with Edit Distance One

Authors
Djamal Belazzougui
Rossano Venturini
Affiliations
[1] University of Helsinki, Department of Computer Science, Helsinki Institute for Information Technology HIIT
[2] University of Pisa, Department of Computer Science
Source
Algorithmica | 2016 / Vol. 74
Keywords
Compressed data structure; Pattern matching; Approximate search
DOI
Not available
Abstract
In this paper we present different solutions for the problem of indexing a dictionary of strings in compressed space. Given a pattern $P$, the index has to report all the strings in the dictionary having edit distance at most one with $P$. Our first solution is able to solve queries in (almost optimal) $O(|P|+occ)$ time, where $occ$ is the number of strings in the dictionary having edit distance at most one with $P$. The space complexity of this solution is bounded in terms of the $k$th order entropy of the indexed dictionary. A second solution further improves this space complexity at the cost of increasing the query time. Finally, we propose randomized solutions (Monte Carlo and Las Vegas) which achieve simultaneously the time complexity of the first solution and the space complexity of the second one.
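To make the query in the abstract concrete, the following is a minimal sketch of a naive, uncompressed baseline. It is not the paper's data structure: it scans every dictionary string, so a query costs time proportional to the total dictionary length rather than $O(|P|+occ)$, and it uses no compression. The function names `within_one_edit` and `search_naive` are hypothetical, chosen here for illustration only.

```python
def within_one_edit(a: str, b: str) -> bool:
    """Return True if the edit distance between a and b is at most one
    (one insertion, deletion, or substitution, or no edit at all)."""
    if len(a) > len(b):
        a, b = b, a  # ensure len(a) <= len(b)
    if len(b) - len(a) > 1:
        return False
    # Skip the longest common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Same length: only a substitution at position i can explain a
        # mismatch, so the suffixes after i must be identical.
        return a[i + 1:] == b[i + 1:]
    # b is one character longer: deleting b[i] must yield a.
    return a[i:] == b[i + 1:]


def search_naive(dictionary, pattern):
    """Report all dictionary strings within edit distance one of pattern
    by a linear scan (illustrative baseline, not an index)."""
    return [s for s in dictionary if within_one_edit(s, pattern)]


if __name__ == "__main__":
    words = ["cat", "cart", "cut", "dog", "at", "cats", "coat", "ca"]
    print(search_naive(words, "cat"))
```

Here the number of strings returned by `search_naive` corresponds to $occ$ in the abstract; the solutions summarized above aim to report the same set without scanning the whole dictionary and while storing it in compressed space.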
Pages: 1099-1122
Number of pages: 23