Flexible and Efficient Algorithms for Abelian Matching in Genome Sequence

Cited by: 0

Authors:

Faro, Simone [1]

Pavone, Arianna [2]

Affiliations:

[1] Univ Catania, Dipartimento Matemat & Informat, Viale Andrea Doria 6, I-95125 Catania, Italy

[2] Univ Messina, Dipartimento Sci Cognit, Via Concez 6, I-98122 Messina, Italy

Source:

BIOINFORMATICS AND BIOMEDICAL ENGINEERING, IWBBIO 2019, PT I | 2019 / Vol. 11465

Keywords:

Approximate string matching; Abelian matching; Jumbled matching; Experimental algorithms

DOI:

10.1007/978-3-030-17938-0_28

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Approximate matching in strings is a fundamental and challenging problem in computer science and in computational biology, and increasingly fast algorithms are in high demand in many applications, including text processing and DNA sequence analysis. Recently, efficient solutions to specific approximate matching problems on genomic sequences have been designed using a filtering technique based on the general abelian matching problem, which first locates the set of all candidate matching positions and then performs an additional verification test on the collected positions. The abelian pattern matching problem consists of finding all substrings of a text which are permutations of a given pattern. In this paper we present a new class of algorithms based on a new efficient fingerprint computation approach, called Heap-Counting, which turns out to be fast, flexible and easy to implement. We prove that, when applied to searching for short patterns in a DNA sequence, our solutions have linear worst-case time complexity. In addition, we present an experimental evaluation which shows that our newly presented algorithms are among the most efficient and flexible solutions in practice for the abelian matching problem in DNA sequences.
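
The problem statement in the abstract (finding all substrings of a text that are permutations of a given pattern) can be illustrated with a minimal sketch. This is a plain sliding-window character-count baseline, not the Heap-Counting approach of the paper, whose details the abstract does not give. The function name `abelian_matches` and its helper are hypothetical and chosen only for illustration.

```python
def abelian_matches(text: str, pattern: str) -> list[int]:
    """Return the start positions of all substrings of `text` that are
    permutations of `pattern` (sliding-window baseline, NOT Heap-Counting)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # diff[c] = (occurrences of c in the window) - (occurrences of c in the pattern)
    diff: dict[str, int] = {}
    for c in pattern:
        diff[c] = diff.get(c, 0) - 1
    for c in text[:m]:
        diff[c] = diff.get(c, 0) + 1

    # Number of characters whose window count differs from the pattern count.
    mismatched = sum(1 for v in diff.values() if v != 0)

    def shift(c: str, delta: int) -> None:
        """Add `delta` (+1 or -1) to the count of c and update `mismatched`."""
        nonlocal mismatched
        before = diff.get(c, 0)
        after = before + delta
        diff[c] = after
        if before == 0:
            mismatched += 1      # c was balanced and no longer is
        elif after == 0:
            mismatched -= 1      # c has just become balanced

    matches = [0] if mismatched == 0 else []
    for i in range(m, n):
        shift(text[i], +1)       # character entering the window
        shift(text[i - m], -1)   # character leaving the window
        if mismatched == 0:
            matches.append(i - m + 1)
    return matches


print(abelian_matches("GATTACA", "TAT"))  # [1, 2]
```

Here the windows "ATT" (position 1) and "TTA" (position 2) are the only permutations of "TAT" in "GATTACA". Each window shift costs a constant number of dictionary updates, so the sketch runs in time linear in the text length; in the filtering scheme described in the abstract, positions reported this way would serve as candidates for a subsequent verification step.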


Pages: 307-318

Page count: 12
