A novel look-ahead optimization strategy for trie-based approximate string matching

Authors
Ghada Badr
B. John Oommen
Affiliation
[1] Carleton University, School of Computer Science
Keywords
Trie-based syntactic pattern recognition; Approximate string matching; Noisy syntactic recognition using tries; Branch and bound techniques; Pruning
DOI
Not available
Abstract
This paper deals with the problem of estimating a transmitted string X* by processing the corresponding string Y, which is a noisy version of X*. We assume that Y contains substitution, insertion, and deletion errors, and that X* is an element of a finite (but possibly large) dictionary, H. The best estimate X+ of X* is defined as the element of H that minimizes the generalized Levenshtein distance D(X, Y) between X and Y over all X ∈ H, subject to the total number of errors being at most K. The trie is a data structure that offers search costs that are independent of the document size. Tries also merge common prefixes, so by using tries in approximate string matching we can reuse the information obtained while evaluating any one D(Xi, Y) to compute any other D(Xj, Y), where Xi and Xj share a common prefix. In the artificial intelligence (AI) domain, branch and bound (BB) schemes are used to prune paths whose costs exceed a certain threshold. These techniques have been applied to prune, for example, game trees. In this paper, we present a new BB pruning strategy that can be applied to dictionary-based approximate string matching when the dictionary is stored as a trie. The new strategy looks ahead at each node, c, before moving further, merely by evaluating a certain local criterion at c. A search algorithm that follows this pruning strategy will not traverse subtrie(c) unless there is a "hope" of finding a suitable string in it. In other words, as opposed to the reported trie-based methods (Kashyap and Oommen in Inf Sci 23(2):123–142, 1981; Shang and Merrett in IEEE Trans Knowledge Data Eng 8(4):540–547, 1996), the pruning is done a priori, before even embarking on the edit distance computations. The new strategy depends highly on the variance of the lengths of the strings in H. It combines the advantages of partitioning the dictionary according to string length with the advantages gleaned by representing H as a trie. The results demonstrate a marked improvement in the number of operations needed on three benchmark dictionaries (up to 30% when costs are of a 0/1 form, and up to 47% when costs are general).
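The abstract does not spell out the local criterion evaluated at each node, so the following Python sketch is only one plausible reading, not the authors' algorithm. It assumes 0/1 (unit) edit costs and assumes that every trie node stores the shortest and longest word length found in its subtrie; under those assumptions, a subtrie can be skipped before any edit-distance row is computed whenever none of its word lengths lies within K of the length of Y. The names `Node`, `build_trie`, and `search` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    children: dict[str, Node] = field(default_factory=dict)
    is_word: bool = False
    min_len: int = 10**9  # shortest word length in this subtrie (assumed bookkeeping)
    max_len: int = -1     # longest word length in this subtrie (assumed bookkeeping)


def build_trie(words):
    """Build a trie and record, at every node, the length range of its subtrie."""
    root = Node()
    for w in words:
        node = root
        node.min_len = min(node.min_len, len(w))
        node.max_len = max(node.max_len, len(w))
        for ch in w:
            node = node.children.setdefault(ch, Node())
            node.min_len = min(node.min_len, len(w))
            node.max_len = max(node.max_len, len(w))
        node.is_word = True
    return root


def search(root, y, k):
    """Return (word, distance) minimizing unit-cost edit distance to y,
    restricted to distance <= k, or (None, None) if no word qualifies."""
    m = len(y)
    best = [None, k + 1]

    def hopeless(node):
        # Look-ahead test (assumption): with unit costs, a word of length n
        # can be within k errors of y only if |n - m| <= k.
        return node.max_len < m - k or node.min_len > m + k

    def visit(node, prefix, prev_row):
        if node.is_word and prev_row[m] < best[1]:
            best[0], best[1] = prefix, prev_row[m]
        for ch, child in node.children.items():
            if hopeless(child):
                continue  # pruned a priori: no distance row is computed
            # One new row of the edit-distance table, shared by all words
            # that extend prefix + ch.
            row = [prev_row[0] + 1]
            for j in range(1, m + 1):
                row.append(min(row[j - 1] + 1,                        # insertion
                               prev_row[j] + 1,                       # deletion
                               prev_row[j - 1] + (y[j - 1] != ch)))   # substitution
            if min(row) <= k:  # conventional cutoff used by trie-based methods
                visit(child, prefix + ch, row)

    if not hopeless(root):
        visit(root, "", list(range(m + 1)))
    return (best[0], best[1]) if best[0] is not None else (None, None)


if __name__ == "__main__":
    trie = build_trie(["cat", "cart", "card", "care", "carpet", "dog"])
    print(search(trie, "carpat", 2))
```

In this toy dictionary, a query of length 6 with K = 2 skips the subtries holding only "cat" and "dog" without touching the distance table, which illustrates why the benefit of such a test would depend on how widely the string lengths in H vary.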
Pages: 177-187
Page count: 10