Flexible and Efficient Algorithms for Abelian Matching in Genome Sequence

Cited by: 0
Authors
Faro, Simone [1]
Pavone, Arianna [2]
Affiliations
[1] Univ Catania, Dipartimento Matemat & Informat, Viale Andrea Doria 6, I-95125 Catania, Italy
[2] Univ Messina, Dipartimento Sci Cognit, Via Concez 6, I-98122 Messina, Italy
Keywords
Approximate string matching; Abelian matching; Jumbled matching; Experimental algorithms
DOI
10.1007/978-3-030-17938-0_28
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Approximate matching in strings is a fundamental and challenging problem in computer science and computational biology, and increasingly fast algorithms are in high demand in many applications, including text processing and DNA sequence analysis. Recently, efficient solutions to specific approximate matching problems on genomic sequences have been designed using a filtering technique based on the general abelian matching problem: the filter first locates the set of all candidate matching positions and then performs an additional verification test on the collected positions. The abelian pattern matching problem consists in finding all substrings of a text that are permutations of a given pattern. In this paper we present a new class of algorithms based on a new efficient fingerprint computation approach, called Heap-Counting, which turns out to be fast, flexible and easy to implement. We prove that, when applied to searching for short patterns in a DNA sequence, our solutions have linear worst-case time complexity. In addition, we present an experimental evaluation showing that our newly presented algorithms are among the most efficient and flexible solutions in practice for the abelian matching problem in DNA sequences.
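To make the problem statement in the abstract concrete, the following is a minimal sketch of abelian (jumbled) matching using a standard sliding-window character count. It is a textbook baseline and not the Heap-Counting fingerprint approach of the paper, whose details are not given in this record. The function name `abelian_matches` and its signature are hypothetical, chosen for illustration only.

```python
from collections import Counter


def abelian_matches(pattern: str, text: str) -> list[int]:
    """Return the start positions of all substrings of `text`
    that are permutations of `pattern` (baseline sketch, not Heap-Counting)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # need[c] = (occurrences of c in pattern) - (occurrences of c in window)
    need = Counter(pattern)
    need.subtract(text[:m])

    # Number of characters whose counts currently differ.
    mismatched = sum(1 for v in need.values() if v != 0)
    hits = [0] if mismatched == 0 else []

    for i in range(m, n):
        # text[i] enters the window (-1), text[i - m] leaves it (+1).
        for ch, delta in ((text[i], -1), (text[i - m], +1)):
            before = need[ch]
            need[ch] = before + delta
            if before == 0:
                mismatched += 1      # a balanced character became unbalanced
            elif need[ch] == 0:
                mismatched -= 1      # an unbalanced character became balanced
        if mismatched == 0:
            hits.append(i - m + 1)
    return hits


print(abelian_matches("AAC", "ACAACAGT"))  # [0, 1, 2, 3]
```

Each window shift updates two counters, so the sketch runs in time linear in the text length for a fixed alphabet such as the four DNA bases. In a filter-and-verify scheme like the one the abstract mentions, positions reported by such a routine would serve as candidates for a subsequent verification step.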
Pages: 307-318
Number of pages: 12