Sequence-structure relationship study in all-α transmembrane proteins using an unsupervised learning approach

Cited by: 7
Authors
Esque, Jeremy [1, 2, 3, 4, 5, 6, 7, 8]
Urbain, Aurelie [9]
Etchebest, Catherine [1, 2, 3, 4]
de Brevern, Alexandre G. [1, 2, 3, 4]
Affiliations
[1] INSERM, DSIMB, U 1134, F-75739 Paris, France
[2] Univ Paris Diderot, Sorbonne Paris Cite, UMR S 1134, F-75739 Paris, France
[3] INTS, F-75739 Paris, France
[4] Lab Excellence GR Ex, F-75739 Paris, France
[5] ISIS, Lab Ingn Fonct Mol IFM, UMR 7006, F-67000 Strasbourg, France
[6] INSERM, IGBMC, Dept Integrat Struct Biol, U964, F-67404 Illkirch Graffenstaden, France
[7] CNRS, UMR7104, F-67404 Illkirch Graffenstaden, France
[8] Univ Strasbourg, F-67404 Illkirch Graffenstaden, France
[9] INRA, Inst Jean Pierre Bourgin, UMR 1318, F-78026 Versailles, France
Keywords
Transmembrane protein; Learning approach; Sequence-structure relationship; Protein structure; Artificial neural network; Hybrid protein model; Structural alphabet; Classification; PREDICTING RELIABLE REGIONS; ABC TRANSPORTER MSBA; AMINO-ACID; MULTIPLE ALIGNMENT; MEMBRANE-PROTEINS; MOTIFS; DATABASE; HELICES; MATRIX; TM;
DOI
10.1007/s00726-015-2010-5
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Transmembrane proteins (TMPs) are major drug targets, but knowledge of their precise topology remains highly limited compared with that of globular proteins. Despite the difficulty of obtaining their structures, considerable effort has been made in recent years to increase the number of available structures, both experimentally and computationally. In view of this emerging challenge, developing computational methods to extract knowledge from these data is crucial for better understanding their functions and for improving the quality of structural models. Here, we revisit an efficient unsupervised learning procedure, called the Hybrid Protein Model (HPM), and apply it to the analysis of transmembrane proteins belonging to the all-alpha structural class. The HPM method is an original classification procedure that efficiently combines sequence and structure learning. The procedure was initially applied to the analysis of globular proteins. In the present case, HPM classifies a set of overlapping protein fragments extracted from a non-redundant databank of TMP 3D structures. After fine-tuning of the learning parameters, the optimal classification yields 65 clusters. These clusters best capture the shared relationships between sequence and local structural properties of TMPs. Interestingly, among the resulting clusters HPM distinguishes two helical regions with distinct hydrophobic patterns. This underlines the complexity of the topology of these proteins. The HPM classification highlights unusual relationships between amino acids in TMP fragments, which can be useful for building new amino acid substitution matrices. Finally, two challenging applications are described: the first aims at annotating protein function (channel or not); the second assesses the quality of structures (X-ray or models) via a new scoring function derived from the HPM classification.
Pages: 2303-2322
Number of pages: 20