Boosting exact pattern matching with extreme gradient boosting (and more)

Authors
Susik, Robert [1]
Grabowski, Szymon [1]
Affiliations
[1] Lodz Univ Technol, Inst Appl Comp Sci, Lodz, Poland
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 05
Keywords
Text matching; Algorithm selection; Machine learning; Gradient boosting; Pattern matching
DOI
10.1007/s11227-025-07165-2
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Pattern matching is a well-known problem in computer science. Over the years, dozens of exact pattern matching algorithms have been developed. Clearly, search speed is usually the most important aspect, but it is difficult to tell which algorithm is fastest for a specific (given) pattern. Most applications, programming languages, and domain-specific tools maintain a single algorithm for exact pattern matching that may not be the best choice for all use cases. The key finding of this study is that the pattern itself contains information about which algorithm should be used to search for it. We take advantage of this fact to develop a solution that enables faster pattern searching by leveraging machine learning models to select the best-performing algorithm for a given pattern. The selection method uses machine learning models such as Random Forest, Extra Trees, AdaBoost, Bootstrap Aggregation, and Gradient Boosting. The proposed solution is online, i.e., does not require prior reading of the text and is based on the information extracted from the pattern. Experiments show that it is 11% faster than the fastest (on average) exact pattern matching algorithm.
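
The selection idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature set (pattern length, number of distinct symbols, Shannon entropy), the two candidate search routines, and the threshold rule inside `select_algorithm` are all assumptions made for this example. In the paper, a trained model such as Gradient Boosting takes the place of the hand-written rule.

```python
import math
from collections import Counter


def pattern_features(pattern: str) -> dict:
    """Features computed from the pattern alone (the text is never read)."""
    counts = Counter(pattern)
    m = len(pattern)
    # Shannon entropy of the pattern's symbol distribution, in bits.
    entropy = sum((c / m) * math.log2(m / c) for c in counts.values())
    return {"length": m, "distinct": len(counts), "entropy": entropy}


def naive_search(text: str, pattern: str) -> list:
    """Check every alignment; return start positions of all occurrences."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def horspool_search(text: str, pattern: str) -> list:
    """Boyer-Moore-Horspool: skip ahead using a bad-character shift table."""
    m, n = len(pattern), len(text)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    # Shift for each symbol = distance from its rightmost occurrence
    # (excluding the last pattern position) to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, i = [], 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        # Symbols absent from the table allow a full shift of m.
        i += shift.get(text[i + m - 1], m)
    return hits


def select_algorithm(pattern: str):
    """Hypothetical stand-in for a trained classifier's prediction."""
    features = pattern_features(pattern)
    # Assumed rule of thumb, for illustration only: longer patterns
    # permit longer Horspool shifts, so prefer Horspool for them.
    return horspool_search if features["length"] >= 4 else naive_search


def search(text: str, pattern: str) -> list:
    """Pick a search routine from the pattern, then run it on the text."""
    return select_algorithm(pattern)(text, pattern)
```

Note that `select_algorithm` looks only at the pattern, which mirrors the "online" property stated in the abstract: no preprocessing or prior reading of the text is needed before choosing a routine.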
Pages: 15