IDENTIFICATION OF IMPORTANT FUNCTIONAL ENVIRONS IN PROTEIN TERTIARY STRUCTURES FROM THE ANALYSIS OF RESIDUE VARIATION IN 3-D - APPLICATION TO CYTOCHROMES-C AND CARBOXYPEPTIDASE-A AND CARBOXYPEPTIDASE-B
A simple method is described for aligned sets of protein sequences for which at least one representative 3-D Cα structure is known. The evolutionary variation observed at each residue position in the alignment is qualified by the residue variation that has occurred at other positions located within 7 Å of it (according to the probable chain fold). This places the evolutionary behaviour of each residue position in the more appropriate context of its immediate surroundings and distinguishes between invariant residues on the basis of how variable their environments are. The highest mechanistic significance is attached to conserved residues in conserved surroundings, but because the analysis is quantitative, all residue vicinities can be ranked and merged according to the degree of conservation they exhibit and the residue positions they comprise. With the aid of the chain fold, contour maps can therefore be constructed that show graded foci of evolutionary conservation in the underlying superstructure of the protein type, together with the irregular shapes and extents of large conserved areas. To test the method, it was applied to the cytochromes c and to carboxypeptidases A and B.
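
The neighbourhood idea described above can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: per-position variability is taken as the number of distinct residue types in an alignment column minus one, the environment score is the plain mean of that variability over all positions whose C-alpha atoms lie within the 7 Angstrom radius (the position itself included), and the function names, toy alignment and coordinates are hypothetical.

```python
import math


def position_variability(column):
    """Assumed measure: distinct residue types in a column minus one (0 = invariant)."""
    residues = {r for r in column if r != "-"}  # gaps are ignored
    return max(len(residues) - 1, 0)


def environment_variability(alignment, ca_coords, radius=7.0):
    """Return (own, env): per-position variability and its mean over the 3-D neighbourhood.

    alignment : list of equal-length aligned sequences (strings)
    ca_coords : one (x, y, z) C-alpha coordinate per alignment column,
                taken from the representative structure
    radius    : neighbourhood cut-off, in the same units as the coordinates
    """
    columns = list(zip(*alignment))
    if len(columns) != len(ca_coords):
        raise ValueError("need exactly one C-alpha coordinate per alignment column")

    own = [position_variability(col) for col in columns]
    env = []
    for centre in ca_coords:
        # Every position within the cut-off contributes, including the centre itself.
        nearby = [
            own[j]
            for j, other in enumerate(ca_coords)
            if math.dist(centre, other) <= radius
        ]
        env.append(sum(nearby) / len(nearby))
    return own, env


# Hypothetical toy data: three aligned sequences, four positions on a straight line.
alignment = ["ACDE", "ACDF", "GCDH"]
ca_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]

own, env = environment_variability(alignment, ca_coords)

# Rank positions from most to least conserved environment, ties broken by own variability.
ranking = sorted(range(len(own)), key=lambda i: (env[i], own[i]))
```

In this toy case the second and third positions are both invariant, but only the third sits in a fully invariant neighbourhood, so it ranks first. That is the kind of distinction between invariant residues the abstract refers to. A real application would need a considered variability measure and coordinates mapped carefully onto alignment columns.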