On expert curation and scalability: UniProtKB/Swiss-Prot as a case study

Cited by: 81
Authors
Poux, Sylvain [1]
Arighi, Cecilia N. [2]
Magrane, Michele [3]
Bateman, Alex [3]
Wei, Chih-Hsuan [4]
Lu, Zhiyong [4]
Boutet, Emmanuel [1]
Bye-A-Jee, Hema [3]
Famiglietti, Maria Livia [1]
Roechert, Bernd [1]
Martin, Maria Jesus [6]
O'Donovan, Claire [6]
Alpi, Emanuele [6]
Antunes, Ricardo [6]
Bely, Benoit [6]
Bingley, Mark [6]
Bonilla, Carlos [6]
Britto, Ramona [6]
Bursteinas, Borisas [6]
Cowley, Andrew [6]
Da Silva, Alan [6]
De Giorgi, Maurizio [6]
Dogan, Tunca [6]
Fazzini, Francesco [6]
Castro, Leyla Garcia [6]
Figueira, Luis [6]
Garmiri, Penelope [6]
Georghiou, George [6]
Gonzalez, Daniel [6]
Hatton-Ellis, Emma [6]
Li, Weizhong [6]
Liu, Wudong [6]
Lopez, Rodrigo [6]
Luo, Jie [6]
Lussi, Yvonne [6]
MacDougall, Alistair [6]
Nightingale, Andrew [6]
Palka, Barbara [6]
Pichler, Klemens [6]
Poggioli, Diego [6]
Pundir, Sangya [6]
Pureza, Luis [6]
Qi, Guoying [6]
Rosanoff, Steven [6]
Saidi, Rabie [6]
Sawford, Tony [6]
Shypitsyna, Aleksandra [6]
Speretta, Elena [6]
Turner, Edward [6]
Tyagi, Nidhi [6]
Affiliations
[1] Ctr Med Univ Geneva, SIB Swiss Inst Bioinformat, Swiss Prot Grp, CH-1211 Geneva 4, Switzerland
[2] Univ Delaware, Prot Informat Resource, Newark, DE 19711 USA
[3] EBI, EMBL, Wellcome Genome Campus, Cambridge CB10 1SD, England
[4] US Natl Lib Med, NCBI, Bethesda, MD 20894 USA
[5] Georgetown Univ, Prot Informat Resource, Med Ctr, Washington, DC 20007 USA
[6] European Bioinformat Inst, Cambridge, England
[7] SIB, Lausanne, Switzerland
[8] Prot Informat Resource, Newark, DE USA
Funding
US National Institutes of Health (NIH)
Keywords
DENND1B; KI-67
DOI
10.1093/bioinformatics/btx439
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Biological knowledgebases, such as UniProtKB/Swiss-Prot, constitute an essential component of daily scientific research by offering distilled, summarized and computable knowledge extracted from the literature by expert curators. While knowledgebases play an increasingly important role in the scientific community, their ability to keep up with the growth of biomedical literature is under scrutiny. Using UniProtKB/Swiss-Prot as a case study, we address this concern via multiple literature triage approaches.

Results: With the assistance of the PubTator text-mining tool, we tagged more than 10 000 articles to assess the ratio of papers relevant for curation. We first show that curators read and evaluate many more papers than they curate, and that measuring the number of curated publications is insufficient to provide a complete picture as demonstrated by the fact that 8000-10 000 papers are curated in UniProt each year while curators evaluate 50 000-70 000 papers per year. We show that 90% of the papers in PubMed are out of the scope of UniProt, that a maximum of 2-3% of the papers indexed in PubMed each year are relevant for UniProt curation, and that, despite appearances, expert curation in UniProt is scalable.
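
The gap between evaluated and curated papers quoted in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses just the two yearly ranges stated in the abstract, and the constant and function names are hypothetical, not taken from the paper.

```python
# Yearly ranges quoted in the abstract (papers per year).
CURATED_PER_YEAR = (8_000, 10_000)      # papers actually curated in UniProt
EVALUATED_PER_YEAR = (50_000, 70_000)   # papers read and evaluated by curators


def curated_fraction_bounds(curated, evaluated):
    """Return the (lowest, highest) share of evaluated papers that end up curated.

    The lowest share pairs the fewest curated papers with the most evaluated
    ones; the highest share pairs the most curated with the fewest evaluated.
    """
    lowest = curated[0] / evaluated[1]
    highest = curated[1] / evaluated[0]
    return lowest, highest


low, high = curated_fraction_bounds(CURATED_PER_YEAR, EVALUATED_PER_YEAR)
print(f"Curated share of evaluated papers: {low:.1%} to {high:.1%}")
```

Under these figures, roughly 11% to 20% of the papers that curators evaluate are eventually curated, which is why counting curated publications alone understates the curation effort.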
Pages: 3454-3460
Number of pages: 7