High-fidelity (repeat) consensus sequences from short reads using combined read clustering and assembly

Cited by: 3
Authors
Mann, Ludwig [1]
Balasch, Kristin [1]
Schmidt, Nicola [1]
Heitkam, Tony [1, 2]
Affiliations
[1] Tech Univ Dresden, Fac Biol, D-01069 Dresden, Germany
[2] Karl Franzens Univ Graz, Inst Biol, NAWI Graz, A-8010 Graz, Austria
Keywords
Repetitive DNA; Transposable elements; Consensus sequences; Repeat assembly; Repeat clustering; eccDNA; Ribosomal DNA; rDNA; Non-model organisms; MALE-FERTILE; GENOME; DNA; TRANSCRIPTION; PLANTS;
DOI
10.1186/s12864-023-09948-4
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Despite the many cheap and fast ways to generate genomic data, accurate genome assembly remains a problem, with repeats in particular being vastly underrepresented and often misassembled. As low-coverage short reads are already sufficient to represent the repeat landscape of any given genome, many read clustering algorithms have been put forward that provide repeat identification and classification. But how can trustworthy, reliable and representative repeat consensuses be derived from unassembled genomes?
Results: Here, we combine methods from repeat identification and genome assembly to derive these robust consensuses. We test several use cases, such as (1) consensus building from clustered short reads of non-model genomes, (2) consensus building from genome-wide amplification setups, and (3) specific repeat-centred questions, such as the linked vs. unlinked arrangement of ribosomal genes. In all our use cases, the derived consensuses are robust and representative. To evaluate overall performance, we compare our high-fidelity repeat consensuses to RepeatExplorer2-derived contigs and check whether they represent real transposable elements as found in long reads. Our results demonstrate that it is possible to generate useful, reliable and trustworthy consensuses from short reads by combining read clustering and genome assembly methods in an automatable way.
Conclusion: We anticipate that our workflow opens the way towards more efficient and less manual repeat characterization and annotation, benefitting all genome studies, but especially those of non-model organisms.
Pages: 11