Algorithms for extracting motifs from biological weighted sequences

Cited by: 1
Authors
Iliopoulos, C. [1]
Perdikuri, K. [2, 3]
Theodoridis, E. [2, 3]
Tsakalidis, A. [2, 3]
Tsichlas, K. [1]
Affiliations
[1] Kings Coll London, London WC2R 2LS, England
[2] Univ Patras, Comp Engn & Informat Dept, GR-26500 Patras, Greece
[3] Res Acad Comp Technol Inst RACTI, 61 Riga Feraiou Str, GR-26221 Patras 26221, Greece
Keywords
Motif extraction; Biological weighted sequences
DOI
10.1016/j.jda.2006.03.018
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
In this paper we present three algorithms for the Motif Identification Problem in biological weighted sequences. The first algorithm extracts repeated motifs from a biological weighted sequence. The motifs correspond to repetitive words that are approximately equal under Hamming distance and occur with probability >= 1/k, where k is a small constant. The second algorithm extracts common motifs from a set of N >= 2 weighted sequences. In this case, the motifs consist of words that must occur with probability >= 1/k in 1 <= q < N distinct sequences of the set. The third algorithm extracts maximal pairs from a biological weighted sequence. A pair in a sequence is the occurrence of the same word twice. The algorithms presented in this paper also improve on previous work on these problems. (C) 2006 Elsevier B.V. All rights reserved.
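The probability threshold 1/k used in the abstract can be made concrete with a small sketch. It assumes the usual model of a weighted sequence, in which each position holds a probability distribution over the alphabet and the probability that a word occurs at a position is the product of its symbols' probabilities. The function names, the toy sequence, and the brute-force enumeration are illustrative assumptions and are not the algorithms of the paper.

```python
def valid_words(wseq, m, k):
    """Map each length-m word with occurrence probability >= 1/k
    to the list of start positions where it reaches that threshold.

    wseq is a list of dicts {symbol: probability}, one per position
    (an assumed representation of a weighted sequence).
    """
    threshold = 1.0 / k
    found = {}
    for start in range(len(wseq) - m + 1):
        # Grow candidate words one symbol at a time. Probabilities never
        # exceed 1, so a prefix below the threshold can be dropped safely.
        partial = [("", 1.0)]
        for pos in wseq[start:start + m]:
            partial = [
                (word + sym, p * q)
                for word, p in partial
                for sym, q in pos.items()
                if p * q >= threshold
            ]
        for word, _ in partial:
            found.setdefault(word, []).append(start)
    return found


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical weighted sequence: position 1 is uncertain between C and G.
wseq = [
    {"A": 1.0},
    {"C": 0.5, "G": 0.5},
    {"A": 1.0},
    {"C": 1.0},
    {"A": 1.0},
]

words = valid_words(wseq, m=2, k=4)
# Words that occur at least twice are candidates for repeated motifs or pairs.
repeated = {w: pos for w, pos in words.items() if len(pos) >= 2}
print(repeated)
```

In this toy input, a word such as "AC" counts as occurring at position 0 because its probability there (1.0 x 0.5) is at least 1/4. Grouping valid words whose `hamming` distance is small would then approximate the "approximately equal" condition on repeated motifs.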
Pages: 229-242
Number of pages: 14