Mining super-secondary structure motifs from 3D protein structures: A sequence order independent approach

Authors
Aung, Zeyar [1]
Li, Jinyan [2]
Affiliations
[1] Inst Infocomm Res, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore
[2] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Source
Keywords
3D protein structure; super-secondary structure; structural motifs mining; DISCOVERY; PACKING; ALGORITHM;
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Super-secondary structure elements (super-SSEs) are structurally conserved ensembles of secondary structure elements (SSEs) within a protein, and they are of great biological interest. In this work, we present a method to formally represent and mine the sequence-order-independent super-SSE motifs that occur repeatedly in large data sets of protein structures. We represent each protein structure as a graph and mine the common cliques from a set of protein graphs to find the motifs. We mine two categories of super-SSE motifs: generic motifs, which occur frequently across the entire database of protein structures, and fold-preferential motifs, which are concentrated in particular protein fold types. From an experimental data set of 600 proteins belonging to 15 large SCOP folds, we have discovered 21 generic motifs and 75 fold-preferential motifs that are both statistically significant and biologically relevant. A number of the discovered motifs (both generic and fold-preferential) resemble well-known super-SSE motifs from the literature, such as beta hairpins, Greek keys, and zinc fingers. Some of the discovered motifs have novel shapes that have not yet been documented. Our method is time-efficient: it discovers all the motifs across the 600 proteins in less than 14 minutes on a standalone PC. The discovered motifs are reported on our project webpage: http://www1.i2r.a-star.edu.sg/~azeyar/SuperSSE/.
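The graph-and-clique idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It assumes a hypothetical toy encoding in which each protein is a pair (node labels, edges), nodes are SSEs labelled "H" (helix) or "E" (strand), and an edge means two SSEs are in spatial contact. The function names, the brute-force clique enumeration, and the label-only motif key are all assumptions made for illustration; the abstract does not say how cliques are compared or how significance is scored.

```python
from collections import Counter
from itertools import combinations


def cliques(nodes, edges, max_size=3):
    """Yield every clique of 2..max_size nodes (brute force, toy scale only)."""
    adjacent = {frozenset(e) for e in edges}
    for k in range(2, max_size + 1):
        for combo in combinations(sorted(nodes), k):
            # A clique requires every pair of its nodes to be connected.
            if all(frozenset(p) in adjacent for p in combinations(combo, 2)):
                yield combo


def frequent_clique_motifs(proteins, min_support=2, max_size=3):
    """Count, per label multiset, how many proteins contain such a clique.

    The key is the sorted tuple of SSE type labels, so it does not depend
    on the order of the SSEs along the sequence (an assumed simplification).
    """
    support = Counter()
    for labels, edges in proteins.values():
        seen = set()  # each protein contributes at most once per motif
        for combo in cliques(labels, edges, max_size):
            seen.add(tuple(sorted(labels[n] for n in combo)))
        support.update(seen)
    return {motif: n for motif, n in support.items() if n >= min_support}


# Hypothetical toy data: node ids follow sequence order, labels are SSE types.
PROTEINS = {
    "p1": ({0: "E", 1: "E", 2: "H"}, [(0, 1), (1, 2), (0, 2)]),
    "p2": ({0: "H", 1: "E", 2: "E"}, [(0, 1), (1, 2), (0, 2)]),
    "p3": ({0: "E", 1: "E"}, [(0, 1)]),
}

if __name__ == "__main__":
    print(frequent_clique_motifs(PROTEINS, min_support=2))
```

In this toy data, p1 and p2 contain the same strand-strand-helix clique even though the helix comes last in one and first in the other, which is the sense in which the matching is sequence order independent. A realistic version would also need geometric edge labels (for example, inter-SSE angles and distances) and a statistical filter, neither of which is modelled here.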
Pages: 15 / +
Page count: 3