Searching for supermaximal repeats in large DNA sequences

Cited by: 0
Authors
Lian, Chen Na [1]
Halachev, Mihail [1]
Shiri, Nematollaah [1]
Affiliations
[1] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ, Canada
Keywords
DNA sequences; supermaximal repeats; suffix tree; performance;
DOI
Not available
Chinese Library Classification (CLC) number
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
We study the problem of finding supermaximal repeats in large DNA sequences. We propose an algorithm, SMR, that uses an auxiliary index structure (POL) derived from, and used in place of, the suffix tree index ST-FD64 [1]. Numerous experiments on the 24 human chromosomes indicate that SMR outperforms the supermaximal repeat search provided by the Vmatch [2] software tool. For supermaximal repeats of at least 10 bases, SMR is twice as fast as Vmatch; for a minimum length of 25 bases, it is 7 times faster; and for a minimum length of 200 bases, it is about 9 times faster. We also study the time and space costs of POL.
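To make the term concrete, here is a minimal brute-force sketch of what a supermaximal repeat is: a substring that occurs at least twice and is not a proper substring of any other repeated substring. This is not the SMR algorithm and does not use POL, ST-FD64, or Vmatch; the function name `supermaximal_repeats` and its `min_len` parameter are hypothetical, and the cubic cost makes it suitable only for toy inputs.

```python
from collections import defaultdict


def supermaximal_repeats(seq, min_len=1):
    """Brute-force illustration only (hypothetical helper, not SMR).

    Returns {repeat: [start positions]} for every substring of length
    >= min_len that occurs at least twice (overlaps allowed) and cannot
    be extended by one character on either side and still be repeated.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")

    # Record the start positions of every substring of length >= min_len.
    positions = defaultdict(list)
    n = len(seq)
    for i in range(n):
        for j in range(i + min_len, n + 1):
            positions[seq[i:j]].append(i)

    repeated = {s for s, starts in positions.items() if len(starts) >= 2}
    alphabet = set(seq)

    # A repeat is contained in a longer repeat exactly when some
    # one-character extension of it is itself repeated.
    result = {}
    for s in repeated:
        extendable = any(c + s in repeated or s + c in repeated for c in alphabet)
        if not extendable:
            result[s] = positions[s]
    return result


if __name__ == "__main__":
    print(supermaximal_repeats("GATTACAGATTA", min_len=3))
```

In the toy sequence `GATTACAGATTA`, the substring `GATTA` occurs at positions 0 and 7, and every other repeated substring lies inside it, so it is the only supermaximal repeat. Raising `min_len` above 5 leaves nothing to report, which mirrors the minimum-length thresholds (10, 25, 200 bases) used in the experiments.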
Pages: 87 - 101
Page count: 15
Related papers
50 in total
  • [31] Some probabilistic results on the nonrandomness of simple sequence repeats in DNA sequences
    Ndifon, Wilfred
    Nkwanta, Asamoah
    Hill, Dwayne
    BULLETIN OF MATHEMATICAL BIOLOGY, 2006, 68 (07) : 1747 - 1759
  • [32] Expansion of tandem repeats and oligomer clustering in coding and noncoding DNA sequences
    Buldyrev, SV
    Dokholyan, NV
    Havlin, S
    Stanley, HE
    Stanley, RHR
    PHYSICA A, 1999, 273 (1-2) : 19 - 32
  • [33] STEPSTONE: A program to detect inter-spread repeats in DNA sequences
    Murakami, H
    Sugaya, N
    Sato, M
    Imaizumi, A
    Aburatani, S
    Horimoto, K
    8TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL VII, PROCEEDINGS: APPLICATIONS OF INFORMATICS AND CYBERNETICS IN SCIENCE AND ENGINEERING, 2004, : 12 - 18
  • [34] Identifying nonrandom occurrences of simple sequence repeats in genomic DNA sequences
    Ndifon, W
    Nkwanta, A
    Hill, D
    ETHNICITY & DISEASE, 2005, 15 (04) : S67 - S70
  • [35] Detection of Tandem Repeats in DNA Sequences Based on Parametric Spectral Estimation
    Zhou, Hongxia
    Du, Liping
    Yan, Hong
    IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, 2009, 13 (05) : 747 - 755
  • [36] ANALYSIS OF THE HUMAN DNA-SEQUENCES CONTAINING ALU-REPEATS
    KOROTKOV, EV
    KOROTKOVA, MA
    DOKLADY AKADEMII NAUK SSSR, 1986, 288 (04) : 1014 - 1017
  • [37] Power law distribution of dimeric tandem repeats in DNA sequences.
    Dokholyan, N
    Buldyrev, S
    Havlin, S
    Stanley, HE
    PHYSICS OF COMPLEX SYSTEMS, 1997, 134 : 742 - 742
  • [38] Autoregressive models for spectral analysis of short tandem repeats in DNA sequences
    Hongxia Zhou
    Hong Yan
    2006 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-6, PROCEEDINGS, 2006, : 1286 - +
  • [39] REPETITIVE DNA-SEQUENCES - SOME CONSIDERATIONS FOR SIMPLE SEQUENCE REPEATS
    BELL, GI
    TORNEY, DC
    COMPUTERS & CHEMISTRY, 1993, 17 (02) : 185 - 190
  • [40] Some Probabilistic Results on the Nonrandomness of Simple Sequence Repeats in DNA Sequences
    Wilfred Ndifon
    Asamoah Nkwanta
    Dwayne Hill
    Bulletin of Mathematical Biology, 2006, 68 : 1747 - 1759