Using Sequence Similarity Networks for Visualization of Relationships Across Diverse Protein Superfamilies

Cited by: 334
Authors
Atkinson, Holly J. [1]
Morris, John H. [2]
Ferrin, Thomas E. [1, 2, 3]
Babbitt, Patricia C. [1, 2, 3]
Affiliations
[1] Univ Calif San Francisco, Inst Quantitative Biosci, San Francisco, CA 94143 USA
[2] Univ Calif San Francisco, Dept Pharm Chem, San Francisco, CA 94143 USA
[3] Univ Calif San Francisco, Dept Biopharm Sci, San Francisco, CA 94143 USA
Source
PLOS ONE | 2009 / Vol. 4 / Issue 02
Keywords
CRYSTAL-STRUCTURE; ALGORITHM; DATABASE; GENERATION; RECEPTORS; CYTOSCAPE; INFERENCE; FAMILIES; TOOLS; CLANS;
DOI
10.1371/journal.pone.0004345
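Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract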
The dramatic increase in heterogeneous types of biological data, in particular the abundance of new protein sequences, requires fast and user-friendly methods for organizing this information in a way that enables functional inference. The most widely used strategy to link sequence or structure to function, homology-based function prediction, relies on the fundamental assumption that sequence or structural similarity implies functional similarity. New tools that extend this approach are still urgently needed to associate sequence data with biological information in ways that accommodate the real complexity of the problem, while being accessible to experimental as well as computational biologists. To address this, we have examined the application of sequence similarity networks for visualizing functional trends across protein superfamilies from the context of sequence similarity. Using three large groups of homologous proteins of varying types of structural and functional diversity (GPCRs and kinases from humans, and the crotonase superfamily of enzymes), we show that overlaying networks with orthogonal information is a powerful approach for observing functional themes and revealing outliers. In comparison to other primary methods, networks provide both a good representation of group-wise sequence similarity relationships and a strong visual and quantitative correlation with phylogenetic trees, while enabling analysis and visualization of much larger sets of sequences than trees or multiple sequence alignments can easily accommodate. We also define important limitations and caveats in the application of these networks. As a broadly accessible and effective tool for the exploration of protein superfamilies, sequence similarity networks show great potential for generating testable hypotheses about protein structure-function relationships.
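Illustrative sketch (not taken from the paper): the abstract does not spell out how a sequence similarity network is built, so the example below assumes the common formulation in which each sequence is a node and an edge joins two sequences whose pairwise score passes a chosen cutoff. Here the score is assumed to be an E-value, where lower means more similar. The function names, the sequence labels, the toy E-values, and the two cutoffs are all hypothetical and serve only to show how the grouping changes with the threshold.

```python
from collections import defaultdict


def build_network(pairwise_evalues, threshold):
    """Return an adjacency map; an edge is kept when E-value <= threshold.

    pairwise_evalues maps (seq_a, seq_b) -> E-value (assumed input format).
    Every sequence appears as a node, even if it ends up with no edges.
    """
    graph = defaultdict(set)
    for (a, b), evalue in pairwise_evalues.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if a != b and evalue <= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def connected_components(graph):
    """Group nodes into clusters using an iterative depth-first search."""
    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy scores, invented purely for illustration.
toy_evalues = {
    ("A", "B"): 1e-40,
    ("A", "C"): 1e-35,
    ("B", "C"): 1e-30,
    ("C", "D"): 1e-8,
    ("D", "E"): 1e-50,
}

strict = connected_components(build_network(toy_evalues, threshold=1e-20))
relaxed = connected_components(build_network(toy_evalues, threshold=1e-5))
print(strict)
print(relaxed)
```

With the stricter cutoff the weak C-D link is dropped and the toy sequences separate into two groups; with the relaxed cutoff they merge into one. Choosing the cutoff is therefore a modelling decision that shapes what the network appears to show.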
Pages: 14