Fast frequent string mining using suffix arrays

Cited by: 8
Authors
Fischer, J [1 ]
Heun, V [1 ]
Kramer, S [1 ]
Affiliations
[1] Univ Munich, Inst Informat, D-80333 Munich, Germany
Source
FIFTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2005
DOI
10.1109/ICDM.2005.62
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a method to mine strings that are frequent in one database and infrequent in another. The method uses suffix and LCP arrays, which can be computed extremely fast and space-efficiently and which exhibit good locality behavior. Experiments with several biologically relevant data sets show that our approach outperforms existing methods in terms of time and space.
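The idea named in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it uses a naive suffix-array construction, Kasai's linear-time LCP computation, and a fixed substring length `k`. It also assumes that "frequency" means the number of database strings containing a substring. All function names, the separator character, and the thresholds `min_pos` and `max_neg` are hypothetical and chosen only for illustration.

```python
SEP = "\x00"  # assumed separator; must not occur in any database string


def build_suffix_array(text):
    # Naive construction by sorting suffixes; fine for a small demo only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def build_lcp_array(text, sa):
    # Kasai's algorithm: lcp[r] is the length of the longest common prefix
    # of the suffixes at ranks r-1 and r (lcp[0] is 0).
    n = len(text)
    rank = [0] * n
    for r, start in enumerate(sa):
        rank[start] = r
    lcp = [0] * n
    h = 0
    for i in range(n):
        if rank[i] > 0:
            j = sa[rank[i] - 1]
            while i + h < n and j + h < n and text[i + h] == text[j + h]:
                h += 1
            lcp[rank[i]] = h
            if h > 0:
                h -= 1
        else:
            h = 0
    return lcp


def substring_support(db, k):
    # Map each length-k substring to the number of strings in db containing it.
    text = SEP.join(db) + SEP
    doc = []  # doc[p] = index of the database string that position p belongs to
    for idx, s in enumerate(db):
        doc.extend([idx] * (len(s) + 1))
    sa = build_suffix_array(text)
    lcp = build_lcp_array(text, sa)
    n = len(text)
    support = {}
    i = 0
    while i < n:
        # Consecutive suffixes with lcp >= k share the same length-k prefix.
        j = i
        while j + 1 < n and lcp[j + 1] >= k:
            j += 1
        prefix = text[sa[i]:sa[i] + k]
        if len(prefix) == k and SEP not in prefix:
            support[prefix] = len({doc[sa[p]] for p in range(i, j + 1)})
        i = j + 1
    return support


def mine(pos, neg, k, min_pos, max_neg):
    # Length-k substrings frequent in pos and infrequent in neg.
    sup_pos = substring_support(pos, k)
    sup_neg = substring_support(neg, k)
    return sorted(s for s, c in sup_pos.items()
                  if c >= min_pos and sup_neg.get(s, 0) <= max_neg)
```

For example, `mine(["abcab", "cabd", "xab"], ["bcd", "abd"], 2, 2, 0)` keeps only substrings of length 2 that appear in at least two positive strings and in no negative string.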
Pages: 609 - 612
Number of pages: 4
Related papers
50 in total
  • [21] Fast mining maximum frequent itemsets
    Lu, S.F.
    Lu, Z.D.
    Ruan Jian Xue Bao/Journal of Software, 2001, 12 (02) : 293 - 297
  • [22] SuffixMiner: Efficiently mining frequent itemsets in data streams by suffix-forest
    Jia, LF
    Zhou, CG
    Wang, Z
    Xu, XJ
    FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 2, PROCEEDINGS, 2005, 3614 : 592 - 595
  • [23] Suffix arrays
    Bentley, J
    DR DOBBS JOURNAL, 2001, 26 (04) : 145 - 147
  • [24] Fast algorithms for frequent itemset mining using FP-trees
    Grahne, G
    Zhu, JF
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (10) : 1347 - 1362
  • [25] Sliding Window Update Using Suffix Arrays
    Ferreira, Artur
    Oliveira, Arlindo
    Figueiredo, Mario
    2011 DATA COMPRESSION CONFERENCE (DCC), 2011 : 456 - 456
  • [26] Distributed query processing using suffix arrays
    Marín, M
    Navarro, G
    STRING PROCESSING AND INFORMATION RETRIEVAL, PROCEEDINGS, 2003, 2857 : 311 - 325
  • [27] A Compact RDF Store Using Suffix Arrays
    Brisaboa, Nieves R.
    Cerdeira-Pena, Ana
    Farina, Antonio
    Navarro, Gonzalo
    STRING PROCESSING AND INFORMATION RETRIEVAL (SPIRE 2015), 2015, 9309 : 103 - 115
  • [28] Distributed text search using suffix arrays
    Arroyuelo, Diego
    Bonacic, Carolina
    Gil-Costa, Veronica
    Marin, Mauricio
    Navarro, Gonzalo
    PARALLEL COMPUTING, 2014, 40 (09) : 471 - 495
  • [29] An improved fast algorithm of frequent string extracting with no thesaurus
    Zhang, Yumeng
    Liu, Chuanhan
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827 : 894 - +
  • [30] A bit-parallel approach to suffix automata: Fast extended string matching
    Navarro, G
    Raffinot, M
    COMBINATORIAL PATTERN MATCHING, 1998, 1448 : 14 - 33