Sequence-structure relationship study in all-α transmembrane proteins using an unsupervised learning approach

Cited by: 7
Authors
Esque, Jeremy [1, 2, 3, 4, 5, 6, 7, 8]
Urbain, Aurelie [9]
Etchebest, Catherine [1, 2, 3, 4]
de Brevern, Alexandre G. [1, 2, 3, 4]
Affiliations
[1] INSERM, DSIMB, U 1134, F-75739 Paris, France
[2] Univ Paris Diderot, Sorbonne Paris Cite, UMR S 1134, F-75739 Paris, France
[3] INTS, F-75739 Paris, France
[4] Lab Excellence GR Ex, F-75739 Paris, France
[5] ISIS, Lab Ingn Fonct Mol IFM, UMR 7006, F-67000 Strasbourg, France
[6] INSERM, IGBMC, Dept Integrat Struct Biol, U964, F-67404 Illkirch Graffenstaden, France
[7] CNRS, UMR7104, F-67404 Illkirch Graffenstaden, France
[8] Univ Strasbourg, F-67404 Illkirch Graffenstaden, France
[9] INRA, Inst Jean Pierre Bourgin, UMR 1318, F-78026 Versailles, France
Keywords
Transmembrane protein; Learning approach; Sequence-structure relationship; Protein structure; Artificial neural network; Hybrid protein model; Structural alphabet; Classification; PREDICTING RELIABLE REGIONS; ABC TRANSPORTER MSBA; AMINO-ACID; MULTIPLE ALIGNMENT; MEMBRANE-PROTEINS; MOTIFS; DATABASE; HELICES; MATRIX; TM;
DOI
10.1007/s00726-015-2010-5
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Transmembrane proteins (TMPs) are major drug targets, but knowledge of their precise topological structure remains highly limited compared with that of globular proteins. Despite the difficulty of obtaining their structures, a significant effort has been made in recent years to increase the number of available structures, both experimentally and computationally. Given this emerging challenge, developing computational methods that extract knowledge from these data is crucial for better understanding TMP functions and for improving the quality of structural models. Here, we revisit an efficient unsupervised learning procedure, called the Hybrid Protein Model (HPM), and apply it to the analysis of transmembrane proteins belonging to the all-alpha structural class. The HPM method is an original classification procedure that efficiently combines sequence and structure learning. The procedure was initially applied to the analysis of globular proteins. In the present case, HPM classifies a set of overlapping protein fragments extracted from a non-redundant databank of TMP 3D structures. After fine-tuning of the learning parameters, the optimal classification yields 65 clusters. These clusters capture, as well as possible, similar relationships between sequence and local structure properties of TMPs. Interestingly, among the resulting clusters, HPM distinguishes two helical regions with distinct hydrophobic patterns. This underlines the complexity of the topology of these proteins. The HPM classification highlights unusual relationships between amino acids in TMP fragments, which can be useful for designing new amino acid substitution matrices. Finally, two challenging applications are described: the first aims at annotating protein function (channel or not), and the second aims to assess the quality of structures (X-ray or models) via a new scoring function derived from the HPM classification.
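The abstract mentions that HPM operates on overlapping protein fragments. As a rough illustration of that preprocessing idea only, the sketch below cuts a sequence into overlapping windows. The function name `overlapping_fragments`, the fragment length of 5, the step of 1, and the example sequence are all assumptions made for illustration; the abstract does not state the fragment length used, and this is not the authors' implementation.

```python
from typing import List


def overlapping_fragments(sequence: str, length: int, step: int = 1) -> List[str]:
    """Return every window of `length` residues, advancing by `step`.

    Hypothetical helper: `length` and `step` are illustrative parameters,
    not values taken from the paper.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    # A sequence shorter than `length` yields an empty list.
    last_start = len(sequence) - length
    return [sequence[i:i + length] for i in range(0, last_start + 1, step)]


# Made-up 10-residue sequence, cut into windows of 5 residues shifted by 1.
fragments = overlapping_fragments("MKTLLVAGIL", length=5)
```

With a step of 1, consecutive fragments share all but one residue, so every local context in the chain is represented in the fragment set.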
Pages: 2303-2322
Page count: 20
Related papers
42 in total
  • [1] Sequence–structure relationship study in all-α transmembrane proteins using an unsupervised learning approach
    Jérémy Esque
    Aurélie Urbain
    Catherine Etchebest
    Alexandre G. de Brevern
    Amino Acids, 2015, 47 : 2303 - 2322
  • [2] Prediction of local structure in proteins using a library of sequence-structure motifs
    Bystroff, C
    Baker, D
    JOURNAL OF MOLECULAR BIOLOGY, 1998, 281 (03) : 565 - 577
  • [3] Extension of a local backbone description using a structural alphabet: A new approach to the sequence-structure relationship
    de Brevern, AG
    Valadié, H
    Hazout, S
    Etchebest, C
    PROTEIN SCIENCE, 2002, 11 (12) : 2871 - 2886
  • [4] Multisequence algorithm for coarse-grained biomolecular simulations: Exploring the sequence-structure relationship of proteins
    Aina, A.
    Wallin, S.
    JOURNAL OF CHEMICAL PHYSICS, 2017, 147 (09):
  • [5] Local descriptors of protein structure: A systematic analysis of the sequence-structure relationship in proteins using short- and long-range interactions
    Hvidsten, Torgeir R.
    Kryshtafovych, Andriy
    Fidelis, Krzysztof
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2009, 75 (04) : 870 - 884
  • [6] Learning to paraphrase: An unsupervised approach using multiple-sequence alignment
    Barzilay, R
    Lee, L
    HLT-NAACL 2003: HUMAN LANGUAGE TECHNOLOGY CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE MAIN CONFERENCE, 2003, : 16 - 23
  • [7] Comparative modeling in CASP6 using consensus approach to template selection, sequence-structure alignment, and structure assessment
    Venclovas, C
    Margelevicius, M
    PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2005, 61 : 99 - 105
  • [8] Functional divergence after gene duplication and sequence-structure relationship: A case study of G-protein alpha subunits
    Zheng, Ying
    Xu, Dongping
    Gu, Xun
    JOURNAL OF EXPERIMENTAL ZOOLOGY PART B-MOLECULAR AND DEVELOPMENTAL EVOLUTION, 2007, 308B (01) : 85 - 96
  • [9] A Signal Analysis Approach Applied to the Study of Sequence, Structure and Function of the Proteins
    Benigni, Romualdo
    Giuliani, Alessandro
    Zbilut, Joseph P.
    Ellis, Solo W.
    Allorge, Delphine
    CURRENT COMPUTER-AIDED DRUG DESIGN, 2006, 2 (02) : 189 - 201
  • [10] PROBING THE SEQUENCE-STRUCTURE-FUNCTION RELATIONSHIP (SSFR) IN PROTEINS USING NEURAL NETWORKS
    LIEBMAN, MN
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1989, 197 : 29 - COMP