Fast Algorithms for Top-k Approximate String Matching

Authors
Yang, Zhenglu [1]
Yu, Jianjun [2]
Kitsuregawa, Masaru [1]
Affiliations
[1] Univ Tokyo, Inst Ind Sci, Tokyo 1138654, Japan
[2] Chinese Acad Sci, Comp Network Informat Ctr, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Top-k approximate querying on string collections is an important data analysis tool for many applications, and it has been studied extensively. However, the scale of the problem has increased dramatically because of the prevalence of the Web. In this paper, we explore the problem of efficient top-k similar string matching. We introduce several efficient strategies, such as length-aware and adaptive q-gram selection. We present a general q-gram-based framework and propose two efficient algorithms based on these strategies. Our techniques are experimentally evaluated on three real data sets and show superior performance.
Pages: 1467-1473
Page count: 7
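
Illustrative sketch (not from the paper): the abstract mentions a length-aware strategy for top-k similar string matching. The Python below is a minimal sketch of one common way length information can prune a top-k search under edit distance. It assumes edit distance as the similarity measure and uses the fact that the length difference between two strings is a lower bound on their edit distance. The function names `edit_distance` and `top_k_similar` are hypothetical, and the sketch does not implement q-gram indexing or adaptive q-gram selection; it is not the authors' algorithm.

```python
import heapq


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def top_k_similar(query: str, strings: list[str], k: int) -> list[tuple[int, str]]:
    """Return up to k (distance, string) pairs closest to query.

    Assumption for this sketch: candidates are visited in order of
    increasing length difference, so the scan can stop as soon as the
    length difference alone rules out any improvement.
    """
    if k <= 0:
        return []
    candidates = sorted(strings, key=lambda s: abs(len(s) - len(query)))
    heap: list[tuple[int, str]] = []  # max-heap via negated distance
    for s in candidates:
        lower_bound = abs(len(s) - len(query))
        # heap[0] holds the current k-th best (largest) distance, negated.
        if len(heap) == k and lower_bound >= -heap[0][0]:
            break  # no remaining candidate can be strictly closer
        d = edit_distance(query, s)
        if len(heap) < k:
            heapq.heappush(heap, (-d, s))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, s))
    return sorted((-neg_d, s) for neg_d, s in heap)


if __name__ == "__main__":
    words = ["sitten", "sitting", "mitten", "kit", "kitchen"]
    print(top_k_similar("kitten", words, 2))
```

In this sketch the scan over `words` stops after the two same-length candidates, because every remaining string differs in length by at least the current second-best distance. A q-gram index would presumably narrow the candidate set further before any edit distance is computed.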