Bioinformatic studies suggest that additional information carried by nucleic acids is necessary to construct the three-dimensional structures of proteins. We find underlying correlations between base contents, all of which occur at the third codon position of a gene sequence. Four inverse relationships are observed: between u3 and c3, between a3 and g3, between u3 and g3, and between c3 and a3. Two positive relationships are apparent: between u3 and a3, and between c3 and g3. The corresponding correlation coefficients reach -0.92, -0.89, -0.83, -0.85, 0.83, and 0.66, respectively, for large proteins with multistate folding kinetics. This interconnection of bases can be ascribed to the choice of synonymous codons associated with protein folding in vivo. In this study, the refolding rate constants of large proteins correlate with the base contents at the third codon position, suggesting an underlying biochemical rationale, related to guiding protein folding, for the choice of synonymous codons. Proteins 2012. (c) 2012 Wiley Periodicals, Inc.
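
To make the quantities above concrete, the following minimal Python sketch shows one way third-position base contents and their pairwise correlation could be computed. It is an illustration under stated assumptions, not the procedure used in the study: u3, c3, a3, and g3 are assumed here to be the fractions of codons in a coding sequence whose third base is U, C, A, or G, the correlation is assumed to be a Pearson coefficient taken across genes, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt

BASES = "ucag"


def third_position_contents(cds):
    """Return assumed u3, c3, a3, g3: fraction of codons ending in each base."""
    seq = cds.lower().replace("t", "u")  # accept DNA or RNA alphabets
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be non-empty, length multiple of 3")
    thirds = seq[2::3]  # every third base, starting at codon position 3
    counts = Counter(thirds)
    n = len(thirds)
    return {base + "3": counts[base] / n for base in BASES}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def correlate_across_genes(sequences, key_a, key_b):
    """Correlate two third-position contents (e.g. 'u3', 'c3') across genes."""
    contents = [third_position_contents(s) for s in sequences]
    return pearson([c[key_a] for c in contents], [c[key_b] for c in contents])
```

For a set of coding sequences, `correlate_across_genes(sequences, "u3", "c3")` would return the coefficient for the u3 and c3 pair under these assumptions; the same call with other keys covers the remaining five pairs.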