PCAP: A whole-genome assembly program

Cited by: 187
Authors
Huang, XQ [1]
Wang, JM
Aluru, S
Yang, SP
Hillier, L
Affiliations
[1] Iowa State Univ, Dept Comp Sci, Ames, IA 50011 USA
[2] Iowa State Univ, Dept Elect & Comp Engn, Ames, IA 50011 USA
[3] Washington Univ, Sch Med, Genome Sequencing Ctr, St Louis, MO 63108 USA
DOI
10.1101/gr.1390403
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We describe a whole-genome assembly program named PCAP for processing tens of millions of reads. The PCAP program has several features to address efficiency and accuracy issues in assembly. Multiple processors are used to perform most time-consuming computations in assembly. A more sensitive method is used to avoid missing overlaps caused by sequencing errors. Repetitive regions of reads are detected on the basis of many overlaps with other reads, instead of many shorter word matches with other reads. Contaminated end regions of reads are identified and removed. Generation of a consensus sequence for a contig is based on an alignment of reads in the contig, in which both base quality values and coverage information are used to determine every consensus base. The PCAP program was tested on a mouse whole-genome data set of 30 million reads and a human Chromosome 20 data set of 1.7 million reads. The program is freely available for academic use.
Pages: 2164-2170
Number of pages: 7
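
The abstract states that each consensus base is determined from an alignment of reads using both base quality values and coverage. The following is a minimal sketch of that general idea, not PCAP's actual algorithm: it sums quality values per candidate base in each alignment column and emits "N" where coverage is too low. The function names, the `min_coverage` parameter, the use of "-" for gaps, and the tie-breaking rule are all assumptions made for illustration.

```python
from collections import defaultdict


def consensus_column(column):
    """Pick a base for one alignment column.

    `column` is a non-empty list of (base, quality) pairs, one per read
    covering this position. A gap is written as "-" (an assumption).
    Returns (best_base, summed_quality_of_best_base, coverage).
    """
    scores = defaultdict(int)
    for base, quality in column:
        scores[base] += quality  # quality-weighted vote
    coverage = len(column)
    # Iterate bases in sorted order so ties resolve deterministically.
    best = max(sorted(scores), key=lambda b: scores[b])
    return best, scores[best], coverage


def consensus(columns, min_coverage=1):
    """Build a consensus string from a list of alignment columns.

    Columns with fewer than `min_coverage` reads (hypothetical threshold)
    yield "N"; columns whose winning symbol is a gap are skipped.
    """
    threshold = max(min_coverage, 1)  # an empty column can never vote
    out = []
    for column in columns:
        if len(column) < threshold:
            out.append("N")
            continue
        base, _, _ = consensus_column(column)
        if base != "-":
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    example_columns = [
        [("A", 30), ("A", 20), ("C", 40)],  # two low-quality A's outvote one C
        [("G", 10), ("T", 35)],
        [("-", 30), ("-", 30), ("A", 20)],  # gap wins, column is dropped
        [("C", 25)],
    ]
    print(consensus(example_columns))
    print(consensus(example_columns, min_coverage=2))
```

Summing qualities lets several moderately confident reads outweigh a single high-quality disagreeing read, which is one simple way coverage and quality can interact; a real assembler would likely use a more careful statistical model.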