Searching for supermaximal repeats in large DNA sequences

Cited by: 0
Authors
Lian, Chen Na [1]
Halachev, Mihail [1]
Shiri, Nematollaah [1]
Affiliation
[1] Concordia Univ, Dept Comp Sci & Software Engn, Montreal, PQ, Canada
Keywords
DNA sequences; supermaximal repeats; suffix tree; performance
DOI
Not available
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
We study the problem of finding supermaximal repeats in large DNA sequences. We propose an algorithm, SMR, that uses an auxiliary index structure (POL), which is derived from and replaces the suffix tree index ST-FD64 [1]. Results from numerous experiments on data from the 24 human chromosomes indicate that SMR outperforms the solution provided with the Vmatch [2] software tool. When searching for supermaximal repeats of at least 10 bases, SMR is twice as fast as Vmatch; for a minimum length of 25 bases, it is 7 times faster; and for a minimum length of 200 bases, it is about 9 times faster. We also study the cost of POL in terms of time and space requirements.
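
To make the problem statement concrete, the following is a minimal brute-force sketch of what a supermaximal repeat is. It is not the SMR algorithm, does not use POL or a suffix tree, and is only practical for toy inputs. It assumes the standard definition: a substring that occurs at least twice and cannot be extended by one character on either side while still occurring at least twice. The function name `supermaximal_repeats` and the parameter `min_len` are illustrative, not taken from the paper.

```python
from collections import Counter


def supermaximal_repeats(seq, min_len=1):
    """Return the supermaximal repeats of seq with length >= min_len.

    Brute-force illustration of the definition only (hypothetical helper,
    not the paper's SMR algorithm): enumerates all substrings, so time and
    memory grow quadratically or worse with len(seq).
    """
    n = len(seq)
    # Count every occurrence of every substring (overlaps included).
    counts = Counter(seq[i:j] for i in range(n) for j in range(i + 1, n + 1))
    repeated = {s for s, c in counts.items() if c >= 2}
    alphabet = set(seq)

    result = []
    for s in repeated:
        if len(s) < min_len:
            continue
        # If a one-character extension is still repeated, s lies inside a
        # longer repeat and therefore is not supermaximal.
        if any((c + s) in repeated or (s + c) in repeated for c in alphabet):
            continue
        result.append(s)
    return sorted(result)


if __name__ == "__main__":
    print(supermaximal_repeats("ACGTACGA"))
    print(supermaximal_repeats("ACACGTGT"))
```

In "ACGTACGA", for instance, "AC" and "CG" both repeat, but each lies inside the longer repeat "ACG", so only "ACG" is reported. The `min_len` parameter plays the role of the minimum repeat length (10, 25, 200 bases) mentioned in the abstract.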
Pages: 87-101
Number of pages: 15