Decomposing mosaic tandem repeats accurately from long reads

Cited by: 4
Authors
Masutani, Bansho [1]
Kawahara, Riki [1]
Morishita, Shinichi [1]
Affiliations
[1] Univ Tokyo, Grad Sch Frontier Sci, Dept Computat Biol & Med Sci, Chiba 2778562, Japan
Keywords
EXPANSION; DNA; SEQUENCES; EVOLUTION; GLOBIN
DOI
10.1093/bioinformatics/btad185
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Over the past 30 years, extended tandem repeats (TRs) have been correlated with approximately 60 diseases with high odds ratios, and most known TRs consist of single repeat units. However, in the last few years, long-read sequencing has shown that mosaic TRs, which are composed of different units, are associated with several brain disorders. Mosaic TRs are difficult-to-characterize sequence configurations that are usually confirmed by manual inspection. Widely used tools are not designed to solve the mosaic TR problem and often fail to decompose mosaic TRs properly. Results: We propose an efficient algorithm that can decompose mosaic TRs in the input string with high sensitivity. Using synthetic benchmark data, we demonstrate that our program, named uTR, outperforms TRF and RepeatMasker in prediction accuracy, especially when mosaic TRs are more complex, and that uTR is faster than TRF and RepeatMasker in most cases.
Pages: 6