Targeted Assembly of Short Sequence Reads

Cited by: 25
Authors
Warren, Rene L. [1 ]
Holt, Robert A. [1 ,2 ]
Affiliations
[1] British Columbia Canc Agcy, Genome Sci Ctr, Vancouver, BC V5Z 4E6, Canada
[2] Simon Fraser Univ, Dept Mol Biol & Biochem, Burnaby, BC V5A 1S6, Canada
Source
PLOS ONE | 2011 / Volume 6 / Issue 05
Keywords
SHORT DNA-SEQUENCES; GENOME; MAP;
DOI
10.1371/journal.pone.0019816
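Chinese Library Classification (CLC) code
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code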
07 ; 0710 ; 09 ;
Abstract
As next-generation sequence (NGS) production continues to increase, analysis is becoming a significant bottleneck. However, in situations where information is required only for specific sequence variants, it is not necessary to assemble or align whole genome data sets in their entirety. Rather, NGS data sets can be mined for the presence of sequence variants of interest by localized assembly, which is a faster, easier, and more accurate approach. We present TASR, a streamlined assembler that interrogates very large NGS data sets for the presence of specific variants by only considering reads within the sequence space of input target sequences provided by the user. The NGS data set is searched for reads with an exact match to all possible short words within the target sequence, and these reads are then assembled stringently to generate a consensus of the target and flanking sequence. Typically, variants of a particular locus are provided as different target sequences, and the presence of the variant in the data set being interrogated is revealed by a successful assembly outcome. However, TASR can also be used to find unknown sequences that flank a given target. We demonstrate that TASR has utility in finding or confirming genomic mutations, polymorphisms, fusions and integration events. Targeted assembly is a powerful method for interrogating large data sets for the presence of sequence variants of interest. TASR is a fast, flexible and easy to use tool for targeted assembly.
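The read-recruitment step described in the abstract (keeping only reads that share an exact short word with a target) can be illustrated with a minimal Python sketch. This is not TASR's implementation: the function names, the word length `k`, and the choice to also index the target's reverse complement are assumptions made for illustration, and the stringent assembly and consensus steps are not shown.

```python
def kmers(seq, k):
    """Return the set of all length-k words (k-mers) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence over A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def recruit_reads(target, reads, k=15):
    """Keep reads sharing at least one exact k-mer with the target.

    Hypothetical helper: both strands of the target are indexed
    (an assumption), and reads shorter than k are never recruited.
    """
    words = kmers(target, k) | kmers(reverse_complement(target), k)
    recruited = []
    for read in reads:
        # Slide a window of length k along the read and look for an exact hit.
        if any(read[i:i + k] in words for i in range(len(read) - k + 1)):
            recruited.append(read)
    return recruited
```

Only the recruited reads would then be passed to an assembler, which is what keeps the search confined to the sequence space of the target rather than the whole data set.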
Pages: 6
Related papers
50 in total
  • [21] A new algorithm for genome assembly from short reads
    Blazewicz, Jacek
    Bryja, Marcin
    Figlerowicz, Marek
    Gawron, Piotr
    Kasprzak, Marta
    Platt, Darren
    Przybytek, Jakub
    Swiercz, Aleksandra
    Szajkowski, Lukasz
    PROCEEDINGS OF THE 2008 1ST INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY, 2008, : 455 - +
  • [22] Assembly complexity of prokaryotic genomes using short reads
    Carl Kingsford
    Michael C Schatz
    Mihai Pop
    BMC Bioinformatics, 11 (1)
  • [23] Exact Transcriptome Reconstruction from Short Sequence Reads
    Lacroix, Vincent
    Sammeth, Michael
    Guigo, Roderic
    Bergeron, Anne
    ALGORITHMS IN BIOINFORMATICS, WABI 2008, 2008, 5251 : 50 - +
  • [24] Erratum: Sense from sequence reads: methods for alignment and assembly
    Paul Flicek
    Ewan Birney
    Nature Methods, 2010, 7 : 479 - 479
  • [26] Iterative Learning for Reference-Guided DNA Sequence Assembly From Short Reads: Algorithms and Limits of Performance
    Shen, Xiaohu
    Shamaiah, Manohar
    Vikalo, Haris
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2014, 62 (17) : 4425 - 4435
  • [27] Evaluation of CircRNA Sequence Assembly Methods Using Long Reads
    Zhang, Jingjing
    Hossain, Md. Tofazzal
    Liu, Weiguo
    Peng, Yin
    Pan, Yi
    Wei, Yanjie
    FRONTIERS IN GENETICS, 2022, 13
  • [28] A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads
    Rausch, Tobias
    Koren, Sergey
    Denisov, Gennady
    Weese, David
    Emde, Anne-Katrin
    Doering, Andreas
    Reinert, Knut
    BIOINFORMATICS, 2009, 25 (09) : 1118 - 1124
  • [29] Error Correction and DeNovo Genome Assembly for the MinION Sequencing Reads mixing Illumina Short Reads
    Kchouk, Mehdi
    Elloumi, Mourad
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 1785 - 1785
  • [30] Merging short and stranded long reads improves transcript assembly
    Kainth A.S.
    Haddad G.A.
    Hall J.M.
    Ruthenburg A.J.
    PLoS Computational Biology, 2023, 19 (10 October)