Automated alignment-based curation of gene models in filamentous fungi

Cited by: 7
Authors
van der Burgt, Ate [1, 2]
Severing, Edouard [2, 3]
Collemare, Jerome [1]
de Wit, Pierre J. G. M. [1]
Affiliations
[1] Univ Wageningen & Res Ctr, Lab Phytopathol, NL-6700 AA Wageningen, Netherlands
[2] Univ Wageningen & Res Ctr, NL-6700 AA Wageningen, Netherlands
[3] Univ Wageningen & Res Ctr, Lab Genet, NL-6700 AA Wageningen, Netherlands
Source
BMC BIOINFORMATICS | 2014 / Vol. 15
Keywords
Gene model; Automated gene model curation; Sequence error; Truncated gene model; Pseudogene; Fungal genome; Cladosporium fulvum; AB-INITIO; GENOME; PREDICTION; PROTEIN;
DOI
10.1186/1471-2105-15-19
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improving gene models through quality control and manual curation is time-consuming, requires skilled biologists, and is performed only to a limited extent. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control to gene models in order to obtain more accurate genome annotations.
Results: We provide a novel method named alignment-based fungal gene prediction (ABFGP) that is particularly suitable for plastic genomes like those of fungi. It assesses gene models on a gene-by-gene basis, making use of informant gene loci. Its performance was benchmarked on 6,965 gene models confirmed by full-length unigenes from ten different fungi; ABFGP correctly predicted 79.4% of these gene models. It improves on the output of ab initio gene prediction software through higher sensitivity and precision for all gene model components. The applicability of the method was shown by revisiting the annotations of six different fungi, using gene loci from up to 29 fungal genomes as informants. For each genome, ABFGP assessed between 7,231 and 8,337 genes and proposed between 1,724 and 3,505 gene model revisions. The reliability of the proposed gene models is assessed by an a posteriori introspection procedure applied to each intron and exon in the multiple gene model alignment. The total number and type of proposed gene model revisions in the six fungal genomes are correlated with the quality of the genome assembly and with the sequencing strategies used by the sequencing centre, highlighting different types of errors in different annotation pipelines. The ABFGP method is particularly successful in discovering sequence errors and/or disruptive mutations that cause truncated and erroneous gene models.
Conclusions: The ABFGP method is an accurate and fully automated quality control method for fungal gene catalogues that can be easily implemented in existing annotation pipelines. With new genomes being released at an exponential rate, the ABFGP method will help decrease the number of gene models that require additional manual curation.
Pages: 13