Boosting exact pattern matching with extreme gradient boosting (and more)

Cited by: 0
Authors
Susik, Robert [1]
Grabowski, Szymon [1]
Affiliations
[1] Lodz Univ Technol, Inst Appl Comp Sci, Lodz, Poland
Source
JOURNAL OF SUPERCOMPUTING | 2025 / Vol. 81 / Issue 05
Keywords
Text matching; Algorithm selection; Machine learning; Gradient boosting; Pattern matching;
DOI
10.1007/s11227-025-07165-2
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Pattern matching is a well-known problem in computer science. Over the years, dozens of exact pattern matching algorithms have been developed. Clearly, search speed is usually the most important aspect, but it is difficult to tell which algorithm is fastest for a specific (given) pattern. Most applications, programming languages, and domain-specific tools maintain a single algorithm for exact pattern matching that may not be the best choice for all use cases. The key finding of this study is that the pattern itself contains information about which algorithm should be used to search for it. We take advantage of this fact to develop a solution that enables faster pattern searching by leveraging machine learning models to select the best-performing algorithm for a given pattern. The selection method uses machine learning models such as Random Forest, Extra Trees, AdaBoost, Bootstrap Aggregation, and Gradient Boosting. The proposed solution is online, i.e., does not require prior reading of the text and is based on the information extracted from the pattern. Experiments show that it is 11% faster than the fastest (on average) exact pattern matching algorithm.
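The selection idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the three pattern features (length, number of distinct symbols, symbol entropy), the two candidate algorithms (a naive scan and Boyer-Moore-Horspool), the timing-based labelling of training patterns, and the 1-nearest-neighbour classifier that stands in for the tree ensembles named in the abstract. Only the selection step is pattern-only; training here runs offline on a sample text.

```python
import math
import time
from collections import Counter


def pattern_features(pattern):
    """Features computed from the pattern alone (hypothetical feature set)."""
    counts = Counter(pattern)
    m = len(pattern)
    entropy = -sum(c / m * math.log2(c / m) for c in counts.values())
    return (float(m), float(len(counts)), entropy)


def naive_search(text, pattern):
    """Return every start offset by checking each alignment in turn."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def horspool_search(text, pattern):
    """Boyer-Moore-Horspool: shift by the bad-character table after each check."""
    m, n = len(pattern), len(text)
    # Distance from the rightmost occurrence (excluding the last symbol) to the end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits, i = [], 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            hits.append(i)
        i += shift.get(text[i + m - 1], m)
    return hits


# Candidate pool; a real pool would hold many more algorithms.
ALGORITHMS = {"naive": naive_search, "horspool": horspool_search}


def fastest_algorithm(text, pattern):
    """Label a training pattern with whichever candidate ran fastest once."""
    timings = {}
    for name, search in ALGORITHMS.items():
        start = time.perf_counter()
        search(text, pattern)
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get)


def train(sample_text, patterns):
    """Offline step: pair each training pattern's features with its label."""
    return [(pattern_features(p), fastest_algorithm(sample_text, p)) for p in patterns]


def select_algorithm(model, pattern):
    """Selection step: 1-nearest neighbour over pattern features only."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    feats = pattern_features(pattern)
    _, label = min(model, key=lambda example: math.dist(example[0], feats))
    return ALGORITHMS[label]


if __name__ == "__main__":
    text = "abracadabra" * 200
    model = train(text, ["abra", "cad", "a", "dabraabr"])
    search = select_algorithm(model, "braca")
    print(search.__name__, search(text, "braca")[:3])
```

Which algorithm gets selected depends on the timings measured during training, so it can differ between runs and machines; the reported match positions do not, because every candidate is an exact matcher.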
Pages: 15
Related papers
50 in total
  • [21] Anticipating bank distress in the Eurozone: An Extreme Gradient Boosting approach
    Climent, Francisco
    Momparler, Alexandre
    Carmona, Pedro
    JOURNAL OF BUSINESS RESEARCH, 2019, 101 : 885 - 896
  • [22] Nuclear charge radius predictions based on eXtreme Gradient Boosting
    Li, Weifeng
    Zhang, Xiaoyan
    Fang, Jiyu
    PHYSICA SCRIPTA, 2024, 99 (04)
  • [23] Gradient boosting with extreme-value theory for wildfire prediction
    Koh, Jonathan
    EXTREMES, 2023, 26 (02) : 273 - 299
  • [24] Power Grid Stability Identification Based on eXtreme Gradient Boosting
    Shan, Jinning
    Li, Zhengwen
    Zhao, Peng
    Wang, Chenqi
    Wang, Xin
    2019 6TH INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE 2019), 2019 : 803 - 809
  • [25] Extreme Learning Machine Enhanced Gradient Boosting for Credit Scoring
    Zou, Yao
    Gao, Changchun
    ALGORITHMS, 2022, 15 (05)
  • [26] Automatic detection of seismic event based on eXtreme gradient boosting
    Huang J.
    Zhang R.
    Gao R.
    Li Y.
    Duan W.
    Chen F.
    Guo T.
    Pan C.
    Zhongguo Shiyou Daxue Xuebao (Ziran Kexue Ban)/Journal of China University of Petroleum (Edition of Natural Science), 2024, 48 (03) : 44 - 56
  • [27] Interpretable credit scoring based on an additive extreme gradient boosting
    Zou, Yao
    Xia, Meng
    Lan, Xingyu
    CHAOS SOLITONS & FRACTALS, 2025, 194
  • [28] Extreme Gradient Boosting Regression Model for Soil Available Boron
    Gokmen, F.
    Uygur, V.
    Sukusu, E.
    EURASIAN SOIL SCIENCE, 2023, 56 (06) : 738 - 746
  • [29] Predicting energy use in construction using Extreme Gradient Boosting
    Han, Jiaming
    Shu, Kunxin
    Wang, Zhenyu
    PEERJ COMPUTER SCIENCE, 2023, 9