Decomposing mosaic tandem repeats accurately from long reads

Cited by: 4
Authors
Masutani, Bansho [1]
Kawahara, Riki [1]
Morishita, Shinichi [1]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Chiba 2778562, Japan
Keywords
EXPANSION; DNA; SEQUENCES; EVOLUTION; GLOBIN;
DOI
10.1093/bioinformatics/btad185
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Over the past 30 years, extended tandem repeats (TRs) have been correlated with approximately 60 diseases with high odds ratios, and most known TRs consist of single repeat units. However, in the last few years, long-read sequencing techniques have revealed mosaic TRs, composed of different units, that are associated with several brain disorders. Mosaic TRs are difficult-to-characterize sequence configurations that are usually confirmed by manual inspection. Widely used tools are not designed to solve the mosaic TR problem and often fail to properly decompose mosaic TRs.
Results: We propose an efficient algorithm that can decompose mosaic TRs in the input string with high sensitivity. Using synthetic benchmark data, we demonstrate that our program, named uTR, outperforms TRF and RepeatMasker in terms of prediction accuracy, especially when mosaic TRs are more complex, and that uTR is faster than TRF and RepeatMasker in most cases.
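To make the notion of "decomposing" a mosaic TR concrete, the following is a minimal toy sketch and not the uTR algorithm described in the paper. It assumes the repeat units are already known and that the read contains no sequencing errors, neither of which holds for real long reads. The function names `decompose` and `format_runs`, the greedy longest-run rule, and the example sequence are all illustrative assumptions.

```python
from typing import List, Tuple


def decompose(seq: str, units: List[str]) -> List[Tuple[str, int]]:
    """Greedily split seq into runs of (unit, copy_count).

    Toy illustration only: units are given in advance and must match
    exactly. At each position, the unit whose run covers the most bases wins.
    """
    runs: List[Tuple[str, int]] = []
    pos = 0
    while pos < len(seq):
        best_unit, best_count, best_span = "", 0, 0
        for unit in units:
            if not unit:
                continue  # skip empty units, which would never advance
            count = 0
            while seq.startswith(unit, pos + count * len(unit)):
                count += 1
            span = count * len(unit)
            if span > best_span:
                best_unit, best_count, best_span = unit, count, span
        if best_span == 0:
            raise ValueError(f"no unit matches at position {pos}")
        runs.append((best_unit, best_count))
        pos += best_span
    return runs


def format_runs(runs: List[Tuple[str, int]]) -> str:
    """Render runs in the compact form (UNIT)count(UNIT)count..."""
    return "".join(f"({unit}){count}" for unit, count in runs)


# Hypothetical mosaic TR: three copies of AAG followed by four copies of AG.
example = "AAG" * 3 + "AG" * 4
print(format_runs(decompose(example, ["AAG", "AG"])))  # (AAG)3(AG)4
```

A greedy rule like this can choose poorly when units overlap or when reads contain errors, which is one reason the mosaic TR problem calls for a dedicated method.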
Pages: 6