Design of Fast Multiple String Searching Based on Improved Prefix Tree

Cited by: 1
Authors
Cheng, Yu [1 ]
Zhang, Tao [2 ]
Affiliations
[1] Tsinghua Univ, Dept Biomed Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Keywords
Multi-string matching; prefix tree; string pattern
DOI
10.1109/WKDD.2010.138
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-string matching is one of the most important components of data mining tasks. New applications in many technology fields require high-performance string matching algorithms. This paper first presents a new string searching approach based on a data structure called a prefix tree. The algorithm eliminates the functional overlap between the HASH table and the prefix function. We then make a small improvement to the prefix tree and present a second algorithm that is faster and uses less space. We demonstrate analytically that the two algorithms inherit the optimality and are very competitive in practice. In tests on both real-life and synthetic data, our algorithms are also efficient, and they are especially effective for varied string patterns and large alphabets.
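The abstract does not give the details of either algorithm, so the following is only a minimal sketch of the general idea it builds on: storing all patterns in one prefix tree and walking it against the text. It is a plain textbook trie search, not the authors' method, and it omits the HASH table and prefix function the abstract mentions. The names `TrieNode`, `build_trie`, and `search` are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class TrieNode:
    """One node of a prefix tree; children are keyed by character."""

    def __init__(self) -> None:
        # A dict keeps nodes small even when the alphabet is large.
        self.children: Dict[str, "TrieNode"] = {}
        # Set to the full pattern when a pattern ends at this node.
        self.pattern: Optional[str] = None


def build_trie(patterns: Iterable[str]) -> TrieNode:
    """Insert every non-empty pattern into a single shared prefix tree."""
    root = TrieNode()
    for pattern in patterns:
        if not pattern:
            continue  # empty patterns would match everywhere; skip them
        node = root
        for ch in pattern:
            node = node.children.setdefault(ch, TrieNode())
        node.pattern = pattern
    return root


def search(text: str, root: TrieNode) -> List[Tuple[int, str]]:
    """Return (start_index, pattern) for every occurrence in text.

    Naive strategy (an assumption of this sketch): restart the walk
    from the root at each text position, with no failure links.
    """
    matches: List[Tuple[int, str]] = []
    for start in range(len(text)):
        node = root
        for i in range(start, len(text)):
            nxt = node.children.get(text[i])
            if nxt is None:
                break  # no pattern continues with this character
            node = nxt
            if node.pattern is not None:
                matches.append((start, node.pattern))
    return matches


if __name__ == "__main__":
    trie = build_trie(["he", "she", "his", "hers"])
    print(search("ushers", trie))
```

Because all patterns share one tree, each text position is checked against every pattern in a single walk whose length is bounded by the longest pattern. Faster multi-pattern methods avoid restarting from the root at every position.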
Pages: 111-114
Number of pages: 4
Related papers
50 in total
  • [31] Mining Sequential Rules Based on Prefix-Tree
    Thien-Trang Van
    Bay Vo
    Bac Le
    NEW CHALLENGES FOR INTELLIGENT INFORMATION AND DATABASE SYSTEMS, 2011, 351 : 147 - +
  • [32] Fast string searching in secondary storage: Theoretical developments and experimental results
    Ferragina, P
    Grossi, R
    PROCEEDINGS OF THE SEVENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 1996, : 373 - 382
  • [33] FASTPAT - A FAST AND EFFICIENT ALGORITHM FOR STRING SEARCHING IN DNA-SEQUENCES
    PRUNELLA, N
    LIUNI, S
    ATTIMONELLI, M
    PESOLE, G
    COMPUTER APPLICATIONS IN THE BIOSCIENCES, 1993, 9 (05): : 541 - 545
  • [34] Improved single and multiple approximate string matching
    Fredriksson, K
    Navarro, G
    COMBINATORIAL PATTERN MATCHING, PROCEEDINGS, 2004, 3109 : 457 - 471
  • [35] An improved location difference of multiple distances based nearest neighbors searching algorithm
    Yang, Liu
    Dong, Limei
    Bi, Xiaoru
    OPTIK, 2016, 127 (22): : 10838 - 10843
  • [36] An improved fast algorithm of frequent string extracting with no thesaurus
    Zhang, Yumeng
    Liu, Chuanhan
    MICAI 2007: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2007, 4827 : 894 - +
  • [37] Efficient frequent pattern mining based on Linear Prefix tree
    Pyun, Gwangbum
    Yun, Unil
    Ryu, Keun Ho
    KNOWLEDGE-BASED SYSTEMS, 2014, 55 : 125 - 139
  • [38] Self-Stabilizing Prefix Tree Based Overlay Networks
    Caron, Eddy
    Datta, Ajoy K.
    Petit, Franck
    Tedeschi, Cedric
    INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2016, 27 (05) : 607 - 630
  • [39] Improved prefix for OFDM-based cognitive radios
    Cooklev, T.
    ELECTRONICS LETTERS, 2012, 48 (04) : 240 - U186
  • [40] String tension and thermodynamics with tree level and tadpole improved actions
    Beinlich, B
    Karsch, F
    Laermann, E
    Peikert, A
    NUCLEAR PHYSICS B-PROCEEDINGS SUPPLEMENTS, 1998, 63 : 922 - 924