Gene3D: merging structure and function for a Thousand genomes

Cited by: 37
Authors
Lees, Jonathan [1]
Yeats, Corin [1]
Redfern, Oliver [1]
Clegg, Andrew [1]
Orengo, Christine [1]
Affiliations
[1] UCL, Dept Biochem & Mol Biol, London WC1 6BT, England
Funding
National Institutes of Health (USA);
Keywords
PROTEIN; RESOURCE; RECOGNITION; PREDICTION; SEQUENCE; DATABASE;
DOI
10.1093/nar/gkp987
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Over the last 2 years the Gene3D resource has been significantly improved, and is now more accurate and with a much richer interactive display via the Gene3D website (http://gene3d.biochem.ucl.ac.uk/). Gene3D provides accurate structural domain family assignments for over 1100 genomes and nearly 10 000 000 proteins. A hidden Markov model library, constructed from the manually curated CATH structural domain hierarchy, is used to search UniProt, RefSeq and Ensembl protein sequences. The resulting matches are refined into simple multi-domain architectures using a recently developed in-house algorithm, DomainFinder 3 (available at: ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/DomainFinder3/). The domain assignments are integrated with multiple external protein function descriptions (e.g. Gene Ontology and KEGG), structural annotations (e.g. coiled coils, disordered regions and sequence polymorphisms) and family resources (e.g. Pfam and eggNog) and displayed on the Gene3D website. The website allows users to view descriptions for both single proteins and genes and large protein sets, such as superfamilies or genomes. Subsets can then be selected for detailed investigation or associated functions and interactions can be used to expand explorations to new proteins. Gene3D also provides a set of services, including an interactive genome coverage graph visualizer, DAS annotation resources, sequence search facilities and SOAP services.
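The abstract says that raw HMM matches are refined into simple multi-domain architectures, but it does not describe how DomainFinder 3 does this. As a rough illustration of the general problem only, the sketch below picks the highest-scoring set of non-overlapping matches on one protein using weighted interval scheduling. This is a swapped-in textbook technique, not the DomainFinder 3 algorithm, and the `Hit` fields, family labels and scores are all hypothetical.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One domain match on a protein (all fields are illustrative assumptions)."""
    family: str   # hypothetical domain family label
    start: int    # first matched residue, inclusive
    end: int      # last matched residue, inclusive
    score: float  # hypothetical match score, higher is better


def resolve_architecture(hits: List[Hit]) -> List[Hit]:
    """Return the non-overlapping subset of hits with the highest total score.

    Classic weighted interval scheduling; NOT the DomainFinder 3 method.
    """
    ordered = sorted(hits, key=lambda h: h.end)
    ends = [h.end for h in ordered]
    # best[i] = (total score, chosen hits) considering only the first i hits
    best = [(0.0, [])]
    for i, hit in enumerate(ordered):
        # j = number of earlier hits that end strictly before this hit starts
        j = bisect_right(ends, hit.start - 1, 0, i)
        take_score = best[j][0] + hit.score
        if take_score > best[i][0]:
            best.append((take_score, best[j][1] + [hit]))
        else:
            best.append(best[i])
    return best[-1][1]


# Usage with made-up matches: famB overlaps both neighbours and is dropped.
example_hits = [
    Hit("famA", 1, 120, 50.0),
    Hit("famB", 100, 200, 30.0),
    Hit("famC", 130, 250, 40.0),
]
print([h.family for h in resolve_architecture(example_hits)])  # ['famA', 'famC']
```

A real assignment pipeline would also have to handle discontinuous domains and tolerate small overlaps, which this sketch deliberately ignores.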
Pages: D296-D300
Page count: 5