RELATIONS BETWEEN PROTEIN-SEQUENCE AND STRUCTURE AND THEIR SIGNIFICANCE

Cited by: 46
Authors
ROOMAN, MJ
RODRIGUEZ, J
WODAK, SJ
Affiliation
[1] Unité de Conformation des Macromolécules Biologiques Université Libre de Bruxelles, CP160, P2
DOI
10.1016/S0022-2836(05)80195-0
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
The relation between amino acid sequence and local structure in proteins is investigated. The local structures considered are either the four classes of secondary structure (H, E, T and C) or four classes of local conformations defined using measures of conformational similarity based on distances between Cα atoms. The classes are obtained by applying an automatic clustering procedure to short polypeptide fragments of uniform length from a database of 75 known protein structures. The thrust of our investigation consists of systematically searching the database for simple amino acid patterns of the type Gly-X-Ala-X-X-Val, where X denotes an arbitrary residue. Patterns that are nearly always associated with the same structure are retained. Finding many such associations, we then evaluate by a statistical approach how many among them are non-random and compare the results for different definitions of local structure. A similar comparison is made for the predictive value of retained associations, which is assessed using an internal test based on dividing the database into "learning" and "test" subsets. While we find that local structures defined by conformational similarity are not superior to secondary structure for prediction purposes, they help us gain insight into the factors that influence the predictive value of derived associations. A major conclusion is that the number of retained associations is in large excess over the number expected from a random correlation between sequence and structure, irrespective of how local conformation is defined. However, only a very small number of these associations can be earmarked as reliable using statistical criteria, due to the limited size of the database. We find, for instance, that the pattern Ala-Ala-X-X-Lys reliably characterizes helix, and the pattern Val-X-Val-X-X-X-Ala reliably characterizes extended structure and β-strand. 
The possibility is discussed that these and other reliable associations correspond to regions of the polypeptide chain whose conformations are locally determined and that these regions may play a role in folding. © 1990 Academic Press Limited.
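The pattern search described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the one-letter residue codes, the choice to read the structure label at the first residue of each match, the thresholds `min_hits` and `min_fraction`, and the toy `chains` data are all assumptions made for illustration, and the simple frequency cutoff stands in for the statistical criteria the abstract mentions without specifying.

```python
from collections import Counter


def parse_pattern(pattern):
    """Turn 'A-A-X-X-K' into a list where None marks a wildcard (X)."""
    return [None if tok == "X" else tok for tok in pattern.split("-")]


def pattern_structure_counts(pattern, chains):
    """Tally structure labels at the first residue of every pattern match.

    chains: iterable of (sequence, structure) string pairs of equal length,
    with structure labels drawn from H, E, T, C.
    """
    slots = parse_pattern(pattern)
    counts = Counter()
    for seq, struct in chains:
        for i in range(len(seq) - len(slots) + 1):
            if all(s is None or seq[i + k] == s for k, s in enumerate(slots)):
                counts[struct[i]] += 1
    return counts


def retained_association(pattern, chains, min_hits=3, min_fraction=0.9):
    """Return the dominant label if the pattern is 'nearly always'
    associated with it, else None. Both thresholds are assumed values."""
    counts = pattern_structure_counts(pattern, chains)
    total = sum(counts.values())
    if total < min_hits:
        return None
    label, hits = counts.most_common(1)[0]
    return label if hits / total >= min_fraction else None


# Made-up toy data, not taken from the 75-protein database.
chains = [
    ("MAAELKQ", "CHHHHHC"),
    ("GAAPTKV", "CHHHHHC"),
    ("SAAGGKL", "CHHHHCC"),
]

# Ala-Ala-X-X-Lys written with one-letter codes.
helix_label = retained_association("A-A-X-X-K", chains)
```

To mimic the internal test mentioned in the abstract, one could derive retained associations from a "learning" subset of `chains` and then check how often each retained pattern predicts the correct label on a held-out "test" subset.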
Pages: 337-350
Number of pages: 14