Conformational Sampling Spaces for Homology Modeling: A Missing Data Approach

Cited by: 0
Authors
Han, Rongsheng [1]
Wu, Guoqing [1]
Zhang, Meiling [2]
Affiliations
[1] North China Elect Power Univ, 2 Beinong Rd, Beijing 102206, Peoples R China
[2] Tianjin Med Univ, Basic Med Coll, Tianjin 300070, Peoples R China
Keywords
PROTEIN; ACCURACY;
DOI
Not available
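Chinese Library Classification number
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering];
Discipline classification code
0807; 0820;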
Abstract
A distance-weighted k-nearest neighbor (KNN) algorithm is proposed to impute missing values in protein comparative modeling (CM). These missing values are caused by insertions/deletions in the multiple structural alignments of a superfamily. Together with principal component analysis (PCA) and the anisotropic network model (ANM), evolutionary deformation information and topological information of proteins are extracted to help construct low-dimensional sampling spaces for the conserved cores of amino acid backbones. Compared to standard CM, our method utilizes more evolutionary information and can study more core residues in the space. As a consequence, the sampling spaces are greatly enlarged. The quality of each sampling space is evaluated by the distance between the query protein and the sampling space. Applications to a set of 33 representative and well-studied superfamilies show that the accuracies of most of the enlarged sampling spaces are below 1 angstrom. This implies that they are good candidates for further applications in protein structure research.
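To make the imputation step concrete, here is a minimal sketch of distance-weighted KNN imputation in Python with NumPy. It is not the authors' implementation: the function name `knn_impute`, the parameters `k` and `eps`, the root-mean-square distance over shared coordinates, and the inverse-distance weights are all assumptions chosen for illustration. In the setting of the abstract, each row might hold the aligned coordinates of one superfamily member, with NaN marking positions lost to insertions/deletions, but that layout is also an assumption.

```python
import numpy as np

def knn_impute(X, k=3, eps=1e-12):
    """Fill NaN entries of X (samples x features) by distance-weighted KNN.

    Hypothetical helper for illustration only. Distances are root-mean-square
    differences over the features that both rows have observed (an assumption).
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    observed = ~np.isnan(X)
    n_rows = X.shape[0]

    for i in range(n_rows):
        for j in np.where(~observed[i])[0]:
            dists, values = [], []
            for d in range(n_rows):
                # A donor must be another row with feature j observed.
                if d == i or not observed[d, j]:
                    continue
                shared = observed[i] & observed[d]
                if not shared.any():
                    continue  # no common features, distance undefined
                diff = X[i, shared] - X[d, shared]
                dists.append(np.sqrt(np.mean(diff ** 2)))
                values.append(X[d, j])
            if not dists:
                continue  # no usable donor: leave the entry as NaN
            dists = np.array(dists)
            values = np.array(values)
            nearest = np.argsort(dists)[:k]
            # Inverse-distance weights; eps avoids division by zero.
            weights = 1.0 / (dists[nearest] + eps)
            filled[i, j] = np.sum(weights * values[nearest]) / np.sum(weights)
    return filled

if __name__ == "__main__":
    data = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]])
    print(knn_impute(data, k=2))
```

Closer neighbors contribute more to the imputed value, so a donor that nearly coincides with the incomplete row dominates the estimate. Only the original observed values are used to compute distances, so the result does not depend on the order in which entries are filled.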
Pages: 544-549
Number of pages: 6