An evaluation of GO annotation retrieval for BioCreAtIvE and GOA

Authors
Camon, EB [1]
Barrell, DG [1]
Dimmer, EC [1]
Lee, V [1]
Magrane, M [1]
Maslen, J [1]
Binns, D [1]
Apweiler, R [1]
Affiliations
[1] European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, England
Source
BMC BIOINFORMATICS | 2005 / Volume 6
Keywords
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: The Gene Ontology Annotation (GOA) database (http://www.ebi.ac.uk/GOA) aims to provide high-quality supplementary GO annotation to proteins in the UniProt Knowledgebase. Like many other biological databases, GOA gathers much of its content from the careful manual curation of literature. However, as both the volume of literature and the number of proteins requiring characterization increase, manual processing capacity can become overloaded. Consequently, semi-automated aids are often employed to expedite the curation process. Traditionally, electronic techniques in GOA have depended largely on exploiting the knowledge in existing resources such as InterPro. In recent years, however, text mining has been hailed as a potentially useful tool to aid the curation process. To encourage the development of such tools, the GOA team at EBI agreed to take part in the functional annotation task of the BioCreAtIvE (Critical Assessment of Information Extraction systems in Biology) challenge. BioCreAtIvE task 2 was an experiment to test whether automatically derived classification using information retrieval and extraction could assist expert biologists in annotating proteins in the UniProt Knowledgebase with terms from the GO vocabulary. GOA provided the training corpus of over 9000 manual GO annotations extracted from the literature. For the test set, we provided a corpus of 200 new Journal of Biological Chemistry articles used to annotate 286 human proteins with GO terms. A team of experts manually evaluated the results of 9 participating groups, each of which provided highlighted sentences to support its GO and protein annotation predictions. Here, we give a biological perspective on the evaluation, explain how we annotate GO using literature, and offer some suggestions to improve the precision of future text-retrieval and extraction techniques. Finally, we provide the results of the first inter-annotator agreement study for manual GO curation, as well as an assessment of our current electronic GO annotation strategies.
Results: The GOA database currently extracts GO annotation from the literature with 91 to 100% precision and at least 72% recall. This sets a particularly high threshold for text mining systems, which, in the initial results of BioCreAtIvE task 2 (GO annotation extraction and retrieval), precisely predicted GO terms only 10 to 20% of the time.
Conclusion: Improvements in the performance and accuracy of text mining for GO terms should be expected in the next BioCreAtIvE challenge. In the meantime, the manual and electronic GO annotation strategies already employed by GOA will provide high-quality annotations.
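The precision and recall figures quoted in the abstract can be illustrated with a minimal sketch. It assumes annotations are compared as exact (protein, GO term) pairs against a curated gold standard. That is a simplification: the evaluation described above was carried out manually by experts, and the function name `precision_recall` and the protein/GO pairings below are hypothetical, chosen only for illustration.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for predicted vs. gold annotation pairs.

    Assumption: an annotation counts as correct only on an exact match of
    the (protein accession, GO term ID) pair. Real evaluations may also
    credit predictions that are close in the GO hierarchy.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold-standard annotations (illustrative pairings only).
gold = {
    ("P12345", "GO:0005515"),
    ("P12345", "GO:0006915"),
    ("Q67890", "GO:0005634"),
    ("Q67890", "GO:0003677"),
}

# Hypothetical system output: two correct pairs and one incorrect pair.
predicted = {
    ("P12345", "GO:0005515"),
    ("Q67890", "GO:0005634"),
    ("Q67890", "GO:0016020"),
}

p, r = precision_recall(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the three predictions are correct and two of the four gold annotations are recovered, so precision is 2/3 and recall is 1/2.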
Pages: 11