Analysis and safety engineering of fuzzy string matching algorithms

Cited by: 3
Authors
Pikies, Malgorzata [1]
Ali, Junade [1]
Affiliations
[1] Cloudflare, London, England
Keywords
String similarity; Fuzzy string matching; Safety engineering; Natural language processing; Binary classification; Neural network
DOI
10.1016/j.isatra.2020.10.014
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we explore fuzzy string matching in an automatic ticket classification and processing system. We compare the performance of the following string similarity algorithms: Longest Common Subsequence (LCS), Dice coefficient, Cosine Similarity, Levenshtein (edit) distance and Damerau distance. Through optimisation, we accomplished a 15% improvement in the ratio of false positive to true positive classifications over the existing approach used by a customer support system for free customers. To introduce greater safety, we complement the fuzzy string matching algorithms with a second-layer Convolutional Neural Network (CNN) binary classifier, improving the keyword classification ratio for two ticket categories by a relative 69% and 78%. Such an approach allows classification to be applied only where a desired level of safety is achieved, such as in instances where automated answers. © 2020 ISA. Published by Elsevier Ltd. All rights reserved.
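For orientation, three of the similarity measures named in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of character bigrams for the Dice coefficient, the length-normalised Levenshtein similarity, and the 0.8 threshold in `matches_keyword` are all hypothetical choices made here for the example.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    previous = [0] * (len(b) + 1)
    for ca in a:
        current = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                current.append(previous[j - 1] + 1)
            else:
                current.append(max(previous[j], current[j - 1]))
        previous = current
    return previous[-1]


def dice_coefficient(a: str, b: str) -> float:
    """Dice coefficient over character bigram multisets (an assumed tokenisation)."""
    bigrams_a = Counter(a[i:i + 2] for i in range(len(a) - 1))
    bigrams_b = Counter(b[i:i + 2] for i in range(len(b) - 1))
    total = sum(bigrams_a.values()) + sum(bigrams_b.values())
    if total == 0:
        # Strings shorter than two characters have no bigrams.
        return 1.0 if a == b else 0.0
    overlap = sum((bigrams_a & bigrams_b).values())
    return 2.0 * overlap / total


def matches_keyword(text: str, keyword: str, threshold: float = 0.8) -> bool:
    """Hypothetical matcher: normalised Levenshtein similarity against a threshold."""
    longest = max(len(text), len(keyword))
    if longest == 0:
        return True
    similarity = 1.0 - levenshtein(text, keyword) / longest
    return similarity >= threshold


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(lcs_length("ABCBDAB", "BDCABA"))
    print(dice_coefficient("night", "nacht"))
    print(matches_keyword("pasword", "password"))
```

Raising the threshold trades recall for fewer false positives, which is the kind of tuning the abstract describes; the second-layer CNN classifier is not sketched here.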
Pages: 1-8
Number of pages: 8