Longest Common Substring with Approximately k Mismatches

Cited by: 9
Authors
Kociumaka, Tomasz [1]
Radoszewski, Jakub [1]
Starikovskaya, Tatiana [2]
Affiliations
[1] Univ Warsaw, Inst Informat, Warsaw, Poland
[2] PSL Univ, Ecole Normale Super, DIENS, Paris, France
Keywords
Randomised algorithms; String similarity measures; Longest common substring; Sketching; Locality-sensitive hashing; Binary jumbled indexing
DOI
10.1007/s00453-019-00548-x
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
In the longest common substring problem, we are given two strings of length n and must find a substring of maximal length that occurs in both strings. It is well known that the problem can be solved in linear time, but the solution is not robust and can vary greatly when the input strings are changed even by one character. To circumvent this, Leimeister and Morgenstern introduced the problem of the longest common substring with k mismatches. Lately, this problem has received a lot of attention in the literature. In this paper, we first show a conditional lower bound based on the Strong Exponential Time Hypothesis (SETH), implying that there is little hope of improving existing solutions. We then introduce a new but closely related problem of the longest common substring with approximately k mismatches and use locality-sensitive hashing to show that it admits a solution with strongly subquadratic running time. We also apply these results to obtain a strongly subquadratic-time 2-approximation algorithm for the longest common substring with k mismatches problem and show conditional hardness of improving its approximation ratio.
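To make the problem statement concrete, the following is a minimal sketch of a naive quadratic-time baseline for the longest common substring with k mismatches. It is not the locality-sensitive hashing algorithm described in the abstract; it only illustrates what is being computed. The function name `lcs_k_mismatches` and its return convention are assumptions made for this illustration.

```python
def lcs_k_mismatches(s: str, t: str, k: int) -> tuple[int, int, int]:
    """Naive baseline (illustrative, not the paper's method).

    Returns (length, start_in_s, start_in_t) of a longest pair of
    equal-length substrings of s and t that differ in at most k positions.
    Runs in O(len(s) * len(t)) time.
    """
    best = (0, 0, 0)
    n, m = len(s), len(t)
    # Each diagonal d fixes an alignment: s[i] is compared with t[i + d].
    for d in range(-(n - 1), m):
        i_start = max(0, -d)
        i_end = min(n, m - d)  # exclusive upper bound on i
        left = i_start
        mismatches = 0
        # Sliding window over the diagonal holding at most k mismatches.
        for right in range(i_start, i_end):
            if s[right] != t[right + d]:
                mismatches += 1
            # Shrink the window from the left until it is valid again.
            while mismatches > k:
                if s[left] != t[left + d]:
                    mismatches -= 1
                left += 1
            length = right - left + 1
            if length > best[0]:
                best = (length, left, left + d)
    return best
```

With k = 0 this reduces to the exact longest common substring, and the example pair "abcde" and "abxde" shows the lack of robustness the abstract mentions: allowing a single mismatch raises the answer from 2 to 5.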
Pages: 2633-2652
Page count: 20
Related papers
  • [1] Longest Common Substring with Approximately k Mismatches
    Tomasz Kociumaka
    Jakub Radoszewski
    Tatiana Starikovskaya
    Algorithmica, 2019, 81 : 2633 - 2652
  • [2] Correction to: Longest Common Substring with Approximately k Mismatches
    Tomasz Kociumaka
    Jakub Radoszewski
    Tatiana Starikovskaya
    Algorithmica, 2019, 81 : 3074 - 3074
  • [3] Publisher Correction: Longest Common Substring with Approximately k Mismatches
    Tomasz Kociumaka
    Jakub Radoszewski
    Tatiana Starikovskaya
    Algorithmica, 2023, 85 : 3323 - 3323
  • [4] Longest Common Substring with Approximately k Mismatches (vol 81, pg 2633, 2019)
    Kociumaka, Tomasz
    Radoszewski, Jakub
    Starikovskaya, Tatiana
    ALGORITHMICA, 2023, 85 (10) : 3323 - 3323
  • [5] Longest Common Substring with Approximately k Mismatches (vol 81, pg 2633, 2019)
    Kociumaka, Tomasz
    Radoszewski, Jakub
    Starikovskaya, Tatiana
    ALGORITHMICA, 2019, 81 (07) : 3074 - 3074
  • [7] Longest common substrings with k mismatches
    Flouri, Tomas
    Giaquinta, Emanuele
    Kobert, Kassian
    Ukkonen, Esko
    INFORMATION PROCESSING LETTERS, 2015, 115 (6-8) : 643 - 647
  • [8] The longest common substring problem
    Crochemore, Maxime
    Iliopoulos, Costas S.
    Langiu, Alessio
    Mignosi, Filippo
    MATHEMATICAL STRUCTURES IN COMPUTER SCIENCE, 2017, 27 (02) : 277 - 295
  • [9] Longest Common Prefixes with k-Mismatches and Applications
    Alamro, Hayam
    Ayad, Lorraine A. K.
    Charalampopoulos, Panagiotis
    Iliopoulos, Costas S.
    Pissis, Solon P.
    SOFSEM 2018: THEORY AND PRACTICE OF COMPUTER SCIENCE, 2018, 10706 : 636 - 649
  • [10] On the Longest Common Cartesian Substring Problem
    Faro, Simone
    Lecroq, Thierry
    Park, Kunsoo
    Scafiti, Stefano
    COMPUTER JOURNAL, 2023, 66 (04) : 907 - 923