Background: High-quality annotation of the genes and transposable elements in complex genomes requires human-curated integration of multiple sources of computational evidence. This evidence includes results from a variety of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and they report results in a diverse array of readable formats that must be translated to a standardized file format. The translated results must then be concatenated into a single source and presented in an integrated form for human curation.
Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS that generates computational results for human curation of the genes and transposable elements in plant genomes. Using DAWGPAWS accelerated annotation of 80-200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors approximately twenty-fold and also significantly improved the completeness and accuracy of the annotation.
Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidence in a high-throughput manner, translating the results to a common file format, and facilitating human curation of those results. We have verified the value of DAWGPAWS by using the pipeline to annotate the genes and transposable elements in 220 BAC insertions from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.
Warren, Andrew S.
Affiliations:
Virginia Tech, Virginia Bioinformat Inst, Blacksburg, VA USA
Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
Archuleta, Jeremy
Affiliation:
Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
Feng, Wu-chun
Affiliation:
Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
Setubal, Joao Carlos
Affiliations:
Virginia Tech, Virginia Bioinformat Inst, Blacksburg, VA USA
Virginia Tech, Dept Comp Sci, Blacksburg, VA USA
Gomes, Tiago M. F. F.
Affiliation:
Univ Fed Rio Grande Do Sul, Programa Posgrad Genet & Biol Mol, Porto Alegre, RS, Brazil
de Melo, Elverson S.
Affiliation:
Fundacao Oswaldo Cruz, Dept Entomol, Inst Aggeu Magalhaes, Recife, PE, Brazil
Wallau, Gabriel L.
Affiliation:
Fundacao Oswaldo Cruz, Dept Entomol, Inst Aggeu Magalhaes, Recife, PE, Brazil
Loreto, Elgion L. S.
Affiliations:
Univ Fed Rio Grande Do Sul, Programa Posgrad Genet & Biol Mol, Porto Alegre, RS, Brazil
Univ Fed Santa Maria, Dept Bioquim & Biol Mol, Santa Maria, RS, Brazil
Univ Fed Santa Maria, Av Roraima 1000, BR-97105900 Santa Maria, RS, Brazil
Parisod, Christian
Affiliations:
INRA, Biol Cellulaire Lab, Inst Jean Pierre Bourgin, F-78026 Versailles, France
Univ Oslo, Nat Hist Museum, Natl Ctr Biosystemat, N-0318 Oslo, Norway
Alix, Karine
Affiliation:
Univ Paris 11, INRA, CNRS, UMR Genet Vegetale, F-91190 Gif Sur Yvette, France
Just, Jeremy
Affiliation:
INRA, Unite Rech Genom Vegetale, F-91057 Evry, France
Petit, Maud
Affiliation:
INRA, Biol Cellulaire Lab, Inst Jean Pierre Bourgin, F-78026 Versailles, France
Sarilar, Veronique
Affiliation:
Univ Paris 11, INRA, CNRS, UMR Genet Vegetale, F-91190 Gif Sur Yvette, France
Mhiri, Corinne
Affiliation:
INRA, Biol Cellulaire Lab, Inst Jean Pierre Bourgin, F-78026 Versailles, France
Ainouche, Malika
Affiliation:
Univ Rennes 1, UMR ECOBIO 6553, F-35042 Rennes, France
Chalhoub, Boulos
Affiliation:
INRA, Unite Rech Genom Vegetale, F-91057 Evry, France
Grandbastien, Marie-Angele
Affiliation:
INRA, Biol Cellulaire Lab, Inst Jean Pierre Bourgin, F-78026 Versailles, France