Fast frequent string mining using suffix arrays

Cited by: 8
Authors
Fischer, J [1 ]
Heun, V [1 ]
Kramer, S [1 ]
Affiliation
[1] Univ Munich, Inst Informat, D-80333 Munich, Germany
Source
FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2005
Keywords
DOI
10.1109/ICDM.2005.62
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method to mine strings that are frequent in one database and infrequent in another. The method uses suffix and lcp arrays, which can be computed extremely fast and space-efficiently and which also exhibit good locality behavior. Experiments with several biologically relevant data sets show that our approach outperforms existing methods in terms of time and space.
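As a rough illustration of the two data structures named in the abstract, the following Python sketch builds a suffix array and an lcp array for a single string and counts pattern occurrences by binary search. It is not the authors' mining algorithm: the function names are hypothetical, the suffix array is built by naive sorting rather than a fast construction, the lcp array uses the well-known linear-time scheme of Kasai et al., and counting occurrences in one text is only an assumed stand-in for the paper's notion of frequency.

```python
def suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix text.
    # Fine for a demo; real tools use far faster dedicated algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def lcp_array(text, sa):
    # lcp[k] = length of the longest common prefix of the suffixes
    # starting at sa[k-1] and sa[k]; lcp[0] is 0 by convention.
    n = len(text)
    rank = [0] * n
    for k, i in enumerate(sa):
        rank[i] = k
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] == 0:
            h = 0
            continue
        j = sa[rank[i] - 1]
        while i + h < n and j + h < n and text[i + h] == text[j + h]:
            h += 1
        lcp[rank[i]] = h
        if h > 0:
            h -= 1
    return lcp


def count_occurrences(text, sa, pattern):
    # All suffixes that start with `pattern` form one contiguous
    # interval of the suffix array; find its two ends by binary search.
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - start


if __name__ == "__main__":
    text = "banana"
    sa = suffix_array(text)
    print(sa)
    print(lcp_array(text, sa))
    print(count_occurrences(text, sa, "ana"))
```

Because both arrays are flat integer arrays scanned mostly left to right, a sketch like this also hints at the locality behavior the abstract mentions.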
Pages: 609-612
Page count: 4
Related papers
50 in total
  • [1] Approximate string matching using compressed suffix arrays
    Huynh, TND
    Hon, WK
    Lam, TW
    Sung, WK
    THEORETICAL COMPUTER SCIENCE, 2006, 352 (1-3) : 240 - 249
  • [2] Approximate string matching using compressed suffix arrays
    Huynh, TND
    Hon, WK
    Lam, TW
    Sung, WK
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2004, 3109 : 434 - 444
  • [3] Fast string searching with suffix trees
    Nelson, MR
    DR DOBBS JOURNAL, 1996, 21 (08) : 115 - 119
  • [4] Inducing enhanced suffix arrays for string collections
    Louza, Felipe A.
    Gog, Simon
    Telles, Guilherme P.
    THEORETICAL COMPUTER SCIENCE, 2017, 678 : 22 - 39
  • [5] Distributed multidimensional suffix arrays for string search
    Fellah, A
    Mawson, R
    2003 IEEE PACIFIC RIM CONFERENCE ON COMMUNICATIONS, COMPUTERS, AND SIGNAL PROCESSING, VOLS 1 AND 2, CONFERENCE PROCEEDINGS, 2003 : 792 - 795
  • [6] Compressed suffix arrays and suffix trees with applications to text indexing and string matching
    Grossi, R
    Vitter, JS
    SIAM JOURNAL ON COMPUTING, 2005, 35 (02) : 378 - 407
  • [7] gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections
    Felipe A. Louza
    Guilherme P. Telles
    Simon Gog
    Nicola Prezza
    Giovanna Rosone
    Algorithms for Molecular Biology, 15
  • [8] gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections
    Louza, Felipe A.
    Telles, Guilherme P.
    Gog, Simon
    Prezza, Nicola
    Rosone, Giovanna
    ALGORITHMS FOR MOLECULAR BIOLOGY, 2020, 15 (01)
  • [9] SUFFIX ARRAYS - A NEW METHOD FOR ONLINE STRING SEARCHES
    MANBER, U
    MYERS, G
    SIAM JOURNAL ON COMPUTING, 1993, 22 (05) : 935 - 948
  • [10] Fast mining frequent itemsets using Nodesets
    Deng, Zhi-Hong
    Lv, Sheng-Long
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (10) : 4505 - 4512