PepBank - a database of peptides based on sequence text mining and public peptide data sources

Cited by: 140
Authors
Shtatland, Timur
Guettler, Daniel
Kossodo, Misha
Pivovarov, Misha
Weissleder, Ralph
Affiliations
[1] Harvard Univ, Massachusetts Gen Hosp, Sch Med, Ctr Mol Imaging Res, Charlestown, MA 02129 USA
[2] No Essex Community Coll, Haverhill, MA 01830 USA
Source
BMC BIOINFORMATICS | 2007 / Volume 8
Keywords
CURATED DATABASE; MHC-BINDING; PHAGE; DISPLAY; PROTEINS; INFORMATION; RESOURCE; GROWTH; CLASSIFICATION; IDENTIFICATION;
DOI
10.1186/1471-2105-8-280
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Peptides are important molecules with diverse biological functions and biomedical uses. To date, no single, searchable archive exists for peptide sequences or their associated biological data. Instead, peptide sequences still have to be mined from abstracts and full-length articles, and/or obtained from fragmented public sources.
Description: We have constructed a new database (PepBank), which at the time of writing contains a total of 19,792 individual peptide entries. The database has a web-based user interface with a simple, Google-like search function, advanced text search, and BLAST and Smith-Waterman search capabilities. The major source of peptide sequence data is text mining of MEDLINE abstracts. Another component of the database is peptide sequence data from public sources (ASPD and UniProt). An additional, smaller part of the database is manually curated from sets of full-text articles and text mining results. We show the utility of the database in different examples of affinity ligand discovery.
Conclusion: We have created and maintain a database of peptide sequences. The database has biological and medical applications, for example, to predict the binding partners of biologically interesting peptides, to develop peptide-based therapeutic or diagnostic agents, or to predict molecular targets or binding specificities of peptides resulting from phage display selection. The database is freely available at http://pepbank.mgh.harvard.edu/, and the text mining source code (Peptide::Pubmed) is freely available from the same site as well as on CPAN (http://www.cpan.org/).
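The abstract names Smith-Waterman search as one way to query the database by sequence. As a rough illustration of what such a local-alignment search computes, the following is a minimal sketch only. It assumes a simple match/mismatch score and a linear gap penalty rather than a substitution matrix, and it does not reflect PepBank's actual implementation or parameters. The function name `smith_waterman`, its default scores, and the example strings are all hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not PepBank's settings.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(
                0,                    # start a new local alignment here
                H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,      # gap in b
                H[i][j - 1] + gap,      # gap in a
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Illustrative strings only; not entries taken from the database.
    print(smith_waterman("ACDRGDSK", "GRGDSP"))  # (8, 'RGDS', 'RGDS')
```

A real peptide search would more plausibly score residues with a substitution matrix and affine gap penalties; the fixed scores here are chosen only to keep the recurrence easy to follow.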
Pages: 10