ProtoMap: automatic classification of protein sequences and hierarchy of protein families

Cited by: 108
Authors
Yona, G
Linial, N
Linial, M
Affiliations
[1] Stanford Univ, Dept Biol Struct, Stanford, CA 94305 USA
[2] Hebrew Univ Jerusalem, Inst Comp Sci, IL-91904 Jerusalem, Israel
[3] Hebrew Univ Jerusalem, Inst Life Sci, Dept Biol Chem, IL-91904 Jerusalem, Israel
DOI
10.1093/nar/28.1.49
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
The ProtoMap site offers an exhaustive classification of all proteins in the SWISS-PROT database into groups of related proteins. The classification is based on analysis of all pairwise similarities among protein sequences. The analysis makes essential use of transitivity to identify homologies among proteins. Within each group of the classification, every two members are either directly or transitively related. However, transitivity is applied restrictively in order to prevent unrelated proteins from clustering together. The classification is done at different levels of confidence, and yields a hierarchical organization of all proteins. The resulting classification splits the protein space into well-defined groups of proteins, which are closely correlated with natural biological families and superfamilies. Many clusters contain protein sequences that are not classified by other databases. The hierarchical organization suggested by our analysis may help in detecting finer subfamilies in families of known proteins. In addition, it brings forth interesting relationships between protein families, upon which local maps for the neighborhood of protein families can be sketched. The ProtoMap web server can be accessed at http://www.protomap.cs.huji.ac.il.
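The idea of grouping proteins by direct or transitive similarity at several confidence levels can be illustrated with a minimal sketch. This is not the ProtoMap algorithm: the abstract says transitivity is applied restrictively, and the details of that restriction are not given here. The sketch below assumes the simplest possible restriction, a single score cutoff per level, and assumes similarities are given as e-value-like scores where smaller means more significant. The function names, the toy protein labels, and the thresholds are all hypothetical.

```python
from collections import defaultdict


def cluster_at_threshold(proteins, similarities, threshold):
    """Group proteins by transitive closure over significant pairs.

    A pair counts as related when its score is <= threshold (assumption:
    smaller scores mean stronger similarity, as with e-values).
    Returns a sorted list of sorted member lists.
    """
    parent = {p: p for p in proteins}

    def find(x):
        # Walk to the set representative, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        if score <= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two groups

    groups = defaultdict(list)
    for p in proteins:
        groups[find(p)].append(p)
    return sorted(sorted(members) for members in groups.values())


def build_hierarchy(proteins, similarities, thresholds):
    """Cluster at each threshold, from strictest to most permissive."""
    return {t: cluster_at_threshold(proteins, similarities, t)
            for t in sorted(thresholds)}


# Toy data (hypothetical): A-B strongly similar, B-C moderately,
# C-D weakly, E unrelated to everything.
toy_proteins = ["A", "B", "C", "D", "E"]
toy_similarities = {("A", "B"): 1e-50, ("B", "C"): 1e-30, ("C", "D"): 1e-5}

if __name__ == "__main__":
    levels = build_hierarchy(toy_proteins, toy_similarities,
                             [1e-40, 1e-20, 1e-1])
    for threshold, clusters in levels.items():
        print(threshold, clusters)
```

In the toy data, A and C are never compared directly, yet they land in the same group at the middle threshold because both are related to B. Loosening the threshold only ever merges groups, which is what makes the levels nest into a hierarchy. A pure cutoff like this one can chain unrelated proteins together through weak links, which is the failure mode the abstract says the restricted use of transitivity is meant to prevent.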
Pages: 49-55
Page count: 7