Improving mammalian genome scaffolding using large insert mate-pair next-generation sequencing

Cited by: 27
Authors
van Heesch, Sebastiaan [1, 2]
Kloosterman, Wigard P. [3]
Lansu, Nico [1, 2]
Ruzius, Frans-Paul [1, 2]
Levandowsky, Elizabeth [4]
Lee, Clarence C. [4]
Zhou, Shiguo [5]
Goldstein, Steve [5]
Schwartz, David C. [5]
Harkins, Timothy T. [4]
Guryev, Victor [1, 2, 6]
Cuppen, Edwin [1, 2, 3]
Affiliations
[1] Hubrecht Inst KNAW, NL-3584 CT Utrecht, Netherlands
[2] Univ Med Ctr Utrecht, NL-3584 CT Utrecht, Netherlands
[3] UMC Utrecht, Dept Med Genet, NL-3584 GG Utrecht, Netherlands
[4] Life Technol Inc, Adv Applicat Grp, Cummings Ctr 500, Beverly, MA 01915 USA
[5] Univ Wisconsin, Dept Chem, UW Biotechnol Ctr, Lab Mol & Computat Genom,Lab Genet, Madison, WI 53706 USA
[6] Univ Groningen, Univ Med Ctr Groningen, European Res Inst Biol Ageing, Lab Genome Struct & Ageing, NL-9713 AV Groningen, Netherlands
Source
BMC GENOMICS | 2013 / Vol. 14
Funding
National Institutes of Health (USA);
Keywords
Genome structure; Genome scaffolding; Mate-pair next-generation sequencing; Contig assembly; Rat genome; STRUCTURAL VARIATION; GEL-ELECTROPHORESIS; CANCER GENOMES; CHROMOTHRIPSIS; DNA; REARRANGEMENTS; ASSEMBLIES; RESOLUTION; EVOLUTION; PATTERNS;
DOI
10.1186/1471-2164-14-257
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Paired-tag sequencing approaches are commonly used for the analysis of genome structure. However, mammalian genomes have a complex organization with a variety of repetitive elements that complicate comprehensive genome-wide analyses. Results: Here, we systematically assessed the utility of paired-end and mate-pair (MP) next-generation sequencing libraries with insert sizes ranging from 170 bp to 25 kb for genome coverage and for improving scaffolding of a mammalian genome (Rattus norvegicus). Despite a lower library complexity, large-insert MP libraries (20 or 25 kb) provided very high physical genome coverage and were found to efficiently span repeat elements in the genome. Medium-sized (5, 8 or 15 kb) MP libraries were much more efficient for genome structure analysis than the more commonly used shorter-insert paired-end and 3 kb MP libraries. Furthermore, the combination of medium- and large-insert libraries resulted in a 3-fold increase in N50 in scaffolding processes. Finally, we show that our data can be used to evaluate and improve contig order and orientation in the current rat reference genome assembly. Conclusions: We conclude that applying combinations of mate-pair libraries with insert sizes that match the distributions of repetitive elements improves contig scaffolding and can contribute to the finishing of draft genomes.
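Note on the N50 metric mentioned in the abstract: N50 is commonly defined as the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal Python sketch of that standard definition, not the authors' pipeline; the function name `n50` and the example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Standard definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at
    least half of the summed length.
    """
    if not lengths:
        raise ValueError("no lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical lengths (arbitrary units), not data from the paper.
contigs = [10, 20, 30, 40]   # before scaffolding
scaffolds = [30, 70]         # same total length after joining contigs

print(n50(contigs))    # 30
print(n50(scaffolds))  # 70
```

Joining contigs into fewer, longer scaffolds leaves the total length unchanged but raises N50, which is why the metric is used to compare scaffolding results.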
Pages: 11
Related papers
50 in total
  • [21] An exhaustive algorithm for detecting copy number aberrations and large structural variants in whole-genome mate-pair sequencing data
    Asmann, Yan W.
    Wang, Chen
    Necela, Brian M.
    Chen, Xianfeng
    Kocher, Jean-Pierre A.
    Maurer, Matthew J.
    Habermann, Thomas M.
    Slager, Susan L.
    Feldman, Andrew L.
    Novak, Anne J.
    Cerhan, James R.
    Perez, Edith A.
    Thompson, E. Aubrey
    [J]. CANCER RESEARCH, 2015, 75
  • [22] Accurate Breakpoint Mapping in Apparently Balanced Translocation Families with Discordant Phenotypes Using Whole Genome Mate-Pair Sequencing
    Aristidou, Constantia
    Koufaris, Costas
    Theodosiou, Athina
    Bak, Mads
    Mehrjouy, Mana M.
    Behjati, Farkhondeh
    Tanteles, George
    Christophidou-Anastasiadou, Violetta
    Tommerup, Niels
    Sismani, Carolina
    [J]. PLOS ONE, 2017, 12 (01):
  • [23] Assessment of isochromosome 12p and 12p abnormalities in germ cell tumors using fluorescence in situ hybridization, single-nucleotide polymorphism arrays, and next-generation sequencing/mate-pair sequencing
    Freitag, C. Eric
    Sukov, William R.
    Bryce, Alan H.
    Berg, Jamie V.
    Vanderbilt, Chad M.
    Shen, Wei
    Smadbeck, James B.
    Greipp, Patricia T.
    Ketterling, Rhett P.
    Jenkins, Robert B.
    Herrera-Hernandez, Loren
    Costello, Brian A.
    Thompson, R. Houston
    Boorjian, Stephen A.
    Leibovich, Bradley C.
    Jimenez, Rafael E.
    Murphy, Stephen J.
    Vasmatzis, George
    Cheville, John C.
    Gupta, Sounak
    [J]. HUMAN PATHOLOGY, 2021, 112 : 20 - 34
  • [24] Next-Generation Sequencing and Genome Editing in Plant Virology
    Hadidi, Ahmed
    Flores, Ricardo
    Candresse, Thierry
    Barba, Marina
    [J]. FRONTIERS IN MICROBIOLOGY, 2016, 7
  • [25] Next-generation sequencing strategies for characterizing the turkey genome
    Dalloul, Rami A.
    Zimin, Aleksey V.
    Settlage, Robert E.
    Kim, Sungwon
    Reed, Kent M.
    [J]. POULTRY SCIENCE, 2014, 93 (02) : 479 - 484
  • [26] A GENOME ASSEMBLY PLATFORM FOR NEXT-GENERATION SEQUENCING TECHNOLOGY
    Lu Wenwen
    Lu Zhiyuan
    Wang Yaxu
    Sun Xiao
    [J]. IFPT'6: PROGRESS ON POST-GENOME TECHNOLOGIES, PROCEEDINGS, 2009, : 166 - 167
  • [27] Whole Cancer Genome Sequencing by Next-Generation Methods
    Ross, Jeffrey S.
    Cronin, Maureen
    [J]. AMERICAN JOURNAL OF CLINICAL PATHOLOGY, 2011, 136 (04) : 527 - 539
  • [28] Exploring the cancer genome in the era of next-generation sequencing
    Hui Dong
    Shengyue Wang
    [J]. Frontiers of Medicine, 2012, 6 (1) : 48 - 55
  • [29] Genome regulation and evolution analysed by next-generation sequencing
    Baehler, Juerg
    [J]. SEMINARS IN CELL & DEVELOPMENTAL BIOLOGY, 2012, 23 (02) : 191 - 191
  • [30] The Genome Assembly Model for Next-Generation Sequencing Data
    Wang, Yirong
    Wei, Chengdong
    Zhang, Xiaodong
    Cen, Tailin
    [J]. PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON APPLIED MATHEMATICS, MODELLING AND STATISTICS APPLICATION (AMMSA 2017), 2017, 141 : 97 - 101