Exhaustive whole-genome tandem repeats search

Cited by: 27
Authors
Krishnan, A [1]
Tang, F [1]
Affiliation
[1] Bioinformatics Institute, Singapore 138671, Singapore
DOI
10.1093/bioinformatics/bth311
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Approximate tandem repeats (ATRs) occur frequently in genomes and are a source of the polymorphisms observed between individuals, which makes them of interest to those studying genetic disorders. Although extensive work has been done to identify ATRs, current approaches are inherently limited in the number of pattern sizes they can search or in the length of the input they can handle.
Results: This paper describes (1) a new algorithm that exhaustively finds all variable-length ATRs in a genomic sequence and (2) a precise description of redundancy in the output, together with an algorithm that significantly reduces it. Our ATR definition is parameterized by a mismatch ratio p, which allows more mismatches in longer tandem repeats and fewer in shorter ones. Furthermore, our algorithm is embarrassingly parallel and can therefore attain near-linear speed-up on Beowulf clusters. We present results of our algorithm applied to sequences of widely differing lengths, from genes to chromosomes.
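To make the role of the mismatch ratio p concrete, here is a minimal brute-force sketch in Python. It is not the authors' algorithm, which the abstract does not specify. It assumes a Hamming-style definition in which a repeat of period k compares each base with the base k positions later, and a span is accepted when the number of mismatches is at most p times the number of comparisons. The names find_atrs and drop_contained, the tuple layout, and the containment filter are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# (start, period, span_length, mismatches); layout is illustrative only.
Repeat = Tuple[int, int, int, int]


def find_atrs(seq: str, p: float, min_period: int, max_period: int) -> List[Repeat]:
    """Brute-force scan: for every period k and start, keep the longest span
    whose mismatch count stays within the budget p * (number of comparisons).
    Assumed definition, cubic time; suitable for toy inputs only."""
    n = len(seq)
    hits: List[Repeat] = []
    for k in range(min_period, max_period + 1):
        for start in range(0, n - 2 * k + 1):
            mismatches = 0
            best = None
            for j in range(start, n - k):
                if seq[j] != seq[j + k]:
                    mismatches += 1
                compared = j - start + 1   # comparisons made so far
                span = compared + k        # bases covered by the repeat
                # Need at least two copies; budget grows with length.
                if span >= 2 * k and mismatches <= p * compared:
                    best = (start, k, span, mismatches)
            if best is not None:
                hits.append(best)
    return hits


def drop_contained(hits: List[Repeat]) -> List[Repeat]:
    """Naive redundancy filter (an assumption, not the paper's definition):
    drop a hit whose span lies inside an earlier hit of the same period."""
    kept: List[Repeat] = []
    max_end: Dict[int, int] = {}
    for start, k, span, mm in sorted(hits, key=lambda h: (h[1], h[0], -h[2])):
        end = start + span
        if end > max_end.get(k, -1):
            kept.append((start, k, span, mm))
            max_end[k] = end
    return kept


raw = find_atrs("ACGACGACGTTTT", 0.0, 3, 3)
print(len(raw))             # 4 overlapping reports of the same ACG repeat
print(drop_contained(raw))  # [(0, 3, 9, 0)]
```

The toy run shows why redundancy matters: one exact repeat is reported from four different starting positions until the filter collapses them. Because each period k is scanned independently, the outer loop could be divided among workers, which is one plausible reading of the "embarrassingly parallel" claim.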
Pages: 2702-2710
Number of pages: 9
Related papers
50 in total
  • [1] STRScan: targeted profiling of short tandem repeats in whole-genome sequencing data
    Haixu Tang
    Etienne Nzabarushimana
    BMC Bioinformatics, 18
  • [2] STRScan: targeted profiling of short tandem repeats in whole-genome sequencing data
    Tang, Haixu
    Nzabarushimana, Etienne
    BMC BIOINFORMATICS, 2017, 18
  • [3] Dot2dot: accurate whole-genome tandem repeats discovery
    Genovese, Loredana M.
    Mosca, Marco M.
    Pellegrini, Marco
    Geraci, Filippo
    BIOINFORMATICS, 2019, 35 (06) : 914 - 922
  • [4] Investigation of short tandem repeats in major depression using whole-genome sequencing data
    Yu, Chenglong
    Baune, Bernhard T.
    Wong, Ma-Li
    Licinio, Julio
    JOURNAL OF AFFECTIVE DISORDERS, 2018, 232 : 305 - 309
  • [5] Characterizing Repeats in Two Whole-Genome Amplification Methods in the Reniform Nematode Genome
    Nyaku, S. T.
    Sripathi, V. R.
    Lawrence, K.
    Sharma, G.
    INTERNATIONAL JOURNAL OF GENOMICS, 2021, 2021
  • [6] Search of Tandem Repeats with Insertion and Deletions in the A. thaliana Genome
    Korotkov, E. V.
    Suvorova, Yu. M.
    Skryabin, K. G.
    DOKLADY BIOCHEMISTRY AND BIOPHYSICS, 2017, 477 (01) : 398 - 400
  • [7] Search of tandem repeats with insertion and deletions in the A. thaliana genome
    E. V. Korotkov
    Yu. M. Suvorova
    K. G. Skryabin
    Doklady Biochemistry and Biophysics, 2017, 477 : 398 - 400
  • [8] A binary search approach to whole-genome data analysis
    Brodsky, Leonid
    Kogan, Simon
    BenJacob, Eshel
    Nevo, Eviatar
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2010, 107 (39) : 16893 - 16898
  • [9] How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans
    Subirana, Juan A.
    Messeguer, Xavier
    GENES, 2018, 9 (10)
  • [10] SeqAnt: Cloud-Based Whole-Genome Annotation and Search
    Kotlar, Alex V.
    Trevino, Cristina E.
    Zwick, Michael E.
    Cutler, David J.
    Wingo, Thomas S.
    ACM-BCB' 2017: PROCEEDINGS OF THE 8TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2017 : 621 - 621