Dotted suffix trees - A structure for approximate text indexing

Authors
Coelho, Luis Pedro
Oliveira, Arlindo L.
Keywords
string algorithms; suffix trees; approximate text matching; text indexing
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work addresses text indexing for approximate matching. Given a text $T$ that is preprocessed to generate an index, the index can later be queried to find the places where a string occurs with up to $k$ errors (edit distance). The indexing structure occupies $O(n \log^k n)$ space in the average case, independent of alphabet size. It can report the existence of a match with $k$ errors in $O(3^k m^{k+1})$ time and report the occurrences in $O(3^k m^{k+1} + ed)$ time, where $m$ is the length of the pattern and $ed$ is the number of matching edit scripts. Construction time is bounded by $O(kN|\Sigma|)$, where $N$ is the number of nodes in the index and $|\Sigma|$ is the alphabet size.
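To make the query concrete, here is a minimal sketch of the problem the index answers: finding where a pattern occurs in a text with at most k errors under edit distance. This is not the dotted suffix tree from the abstract. It is an index-free baseline (a standard dynamic-programming scan that costs time proportional to text length times pattern length on every query), and the function name `approx_occurrences` is a hypothetical one chosen for illustration.

```python
def approx_occurrences(text, pattern, k):
    """Return 1-based end positions in `text` where some substring ending
    there is within edit distance `k` of `pattern`.

    Index-free baseline for illustration only; it is NOT the dotted
    suffix tree structure described in the abstract.
    """
    m = len(pattern)
    # prev[i]: minimum edit distance between pattern[:i] and any substring
    # of the text ending at the previous text position.
    prev = list(range(m + 1))
    ends = []
    for j, c in enumerate(text, start=1):
        # cur[0] = 0 because an occurrence may start at any text position.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra text character
                cur[i - 1] + 1,      # missing pattern character
            )
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


if __name__ == "__main__":
    print(approx_occurrences("abcdefg", "bxd", 1))
```

An index such as the one described in the abstract aims to avoid rescanning the whole text for each query, trading preprocessing time and space for query time that depends on the pattern length and k rather than on the text length.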
Pages: 329-336
Page count: 8