Flexible and Efficient Algorithms for Abelian Matching in Genome Sequence

Authors
Faro, Simone [1]
Pavone, Arianna [2]
Affiliations
[1] Univ Catania, Dipartimento Matemat & Informat, Viale Andrea Doria 6, I-95125 Catania, Italy
[2] Univ Messina, Dipartimento Sci Cognit, Via Concez 6, I-98122 Messina, Italy
Keywords
Approximate string matching; Abelian matching; Jumbled matching; Experimental algorithms
DOI
10.1007/978-3-030-17938-0_28
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Approximate matching in strings is a fundamental and challenging problem in computer science and computational biology, and ever-faster algorithms are in high demand in many applications, including text processing and DNA sequence analysis. Recently, efficient solutions to specific approximate matching problems on genomic sequences have been designed using a filtering technique based on the general abelian matching problem: the filter first locates the set of all candidate matching positions and then performs an additional verification test on the collected positions. The abelian pattern matching problem consists in finding all substrings of a text that are permutations of a given pattern. In this paper we present a new class of algorithms based on a new, efficient fingerprint computation approach, called Heap-Counting, which turns out to be fast, flexible, and easy to implement. We prove that, when applied to searching for short patterns in a DNA sequence, our solutions have linear worst-case time complexity. In addition, we present an experimental evaluation showing that our newly presented algorithms are among the most efficient and flexible solutions in practice for the abelian matching problem on DNA sequences.
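To make the problem statement in the abstract concrete, the following is a minimal sketch of abelian (jumbled) matching using a plain sliding window of symbol counts. It is a textbook baseline and not the Heap-Counting method of the paper, whose fingerprint scheme is not described in this record; the function name `abelian_matches` is an assumption made for illustration.

```python
from collections import Counter


def abelian_matches(text, pattern):
    """Return the start positions of all substrings of `text` that are
    permutations of `pattern` (baseline sketch, not Heap-Counting)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # diff[c] = (occurrences of c in current window) - (occurrences in pattern)
    diff = Counter(text[:m])
    diff.subtract(pattern)

    # Number of symbols whose window count differs from the pattern count.
    unbalanced = sum(1 for v in diff.values() if v != 0)
    hits = [0] if unbalanced == 0 else []

    for i in range(m, n):
        # Slide the window: text[i] enters, text[i - m] leaves.
        for ch, delta in ((text[i], 1), (text[i - m], -1)):
            before = diff[ch]
            diff[ch] = before + delta
            if before == 0:
                unbalanced += 1      # symbol just became unbalanced
            elif diff[ch] == 0:
                unbalanced -= 1      # symbol just became balanced
        if unbalanced == 0:
            hits.append(i - m + 1)
    return hits


if __name__ == "__main__":
    # "CGT" (position 1) and "TGC" (position 4) are permutations of "TGC".
    print(abelian_matches("ACGTTGCA", "TGC"))
```

Each window shift updates two counters, so this baseline runs in time proportional to the text length once the first window has been set up. In the filtering scheme the abstract describes, positions reported by such a routine would serve as candidates for a subsequent verification step.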
Pages: 307-318
Number of pages: 12