NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes

Cited by: 4
Authors
Baltoumas, Fotis A. [1 ]
Karatzas, Evangelos [1 ]
Liu, Sirui [2 ]
Ovchinnikov, Sergey [2 ]
Sofianatos, Yorgos [1 ]
Chen, I-Min [3 ]
Kyrpides, Nikos C. [3 ]
Pavlopoulos, Georgios A. [1 ,3 ,4 ,5 ]
Affiliations
[1] BSRC Alexander Fleming, Inst Fundamental Biomed Res, Vari 16672, Greece
[2] Harvard Univ, John Harvard Distinguished Sci Fellowship Program, Cambridge, MA 02138 USA
[3] Lawrence Berkeley Natl Lab, DOE Joint Genome Inst, 1 Cyclotron Rd, Berkeley, CA 94720 USA
[4] Natl & Kapodistrian Univ Athens, Ctr New Biotechnol & Precis Med, Sch Med, 75 Mikras Asias St, Athens 11527, Greece
[5] BSRC Alexander Fleming, Inst Fundamental Biomed Res, 34 Fleming St, Vari 16672, Greece
Keywords
SECONDARY STRUCTURE; NEURAL-NETWORKS; CLASSIFICATION; VISUALIZATION; PREDICTION; ALGORITHM; INSIGHTS; TOOLS; ALIGN; SCOPE
DOI
10.1093/nar/gkad800
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families whose members have no hits to proteins of reference genomes or to Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available at http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.
Pages: D502 - D512
Page count: 11