Dotted suffix trees - A structure for approximate text indexing

Cited by: 0
Authors
Coelho, Luis Pedro
Oliveira, Arlindo L.
Keywords
string algorithms; suffix trees; approximate text matching; text indexing;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we address the problem of text indexing for approximate matching. Given a text $T$ of length $n$, which is preprocessed to generate an index, we can later query this index to identify the places where a string occurs with up to a certain number of errors $k$ (edit distance). The indexing structure occupies $O(n \log^k n)$ space in the average case, independent of alphabet size. This structure can be used to report the existence of a match with $k$ errors in $O(3^k m^{k+1})$ time and to report the occurrences in $O(3^k m^{k+1} + ed)$ time, where $m$ is the length of the pattern and $ed$ is the number of matching edit scripts. The construction of the structure is time-bounded by $O(kN|\Sigma|)$, where $N$ is the number of nodes in the index and $|\Sigma|$ is the alphabet size.
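To make the query semantics concrete, the following is a minimal sketch of what "occurs with up to k errors" means under edit distance. It is not the dotted suffix tree described in the abstract. It is the classic index-free dynamic programming scan (often attributed to Sellers), which takes O(nm) time per query and is the kind of baseline an index is meant to improve on. The function name `approx_match_ends` and its parameters are hypothetical, chosen only for illustration.

```python
def approx_match_ends(text: str, pattern: str, k: int) -> list[int]:
    """Return every end position j (exclusive) such that some substring of
    `text` ending at j is within edit distance k of `pattern`.

    Illustrative baseline only: a column-wise dynamic programming scan,
    not the indexed search from the abstract.
    """
    m = len(pattern)
    # col[i] = smallest edit distance between pattern[:i] and any substring
    # of the text ending at the current text position.
    col = list(range(m + 1))
    ends = []
    if col[m] <= k:
        # The empty substring at position 0 matches only when m <= k.
        ends.append(0)
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # value for (i-1, j-1); row 0 is always 0
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new_val = min(
                prev_diag + cost,  # match or substitution
                col[i] + 1,        # extra text character
                col[i - 1] + 1,    # missing pattern character
            )
            prev_diag = col[i]
            col[i] = new_val
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    # "bxd" matches "bcd" in "abcdef" with one substitution, ending at 4.
    print(approx_match_ends("abcdef", "bxd", 1))
```

With k = 0 the scan reduces to exact matching, so it reports only the end positions of exact occurrences of the pattern.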
Pages: 329-336
Number of pages: 8