Principal component analysis in protein tertiary structure prediction

Cited by: 3
Authors
Alvarez, Oscar [1]
Fernandez-Martinez, Juan Luis [1]
Fernandez-Brillet, Celia [1]
Cernea, Ana [1]
Fernandez-Muniz, Zulima [1]
Kloczkowski, Andrzej [2, 3]
Affiliations
[1] Univ Oviedo, Dept Math, Grp Inverse Problems Optimizat & Machine Learning, C Federico Garcia Lorca 18, Oviedo 33007, Spain
[2] Nationwide Childrens Hosp, Battelle Ctr Math Med, Columbus, OH USA
[3] Ohio State Univ, Dept Pediat, Columbus, OH 43210 USA
Keywords
Principal component analysis; particle swarm optimization; tertiary protein structure; conformational sampling; protein structure refinement; REFINEMENT; FRAGMENTS; SEQUENCES; FOLD
DOI
10.1142/S0219720018500051
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
We discuss the applicability of principal component analysis (PCA) to protein tertiary structure prediction from amino acid sequence. The algorithm presented in this paper belongs to the category of protein refinement models and involves establishing a low-dimensional space in which sampling (and optimization) is carried out via a particle swarm optimizer (PSO). The reduced space is found via PCA performed on a set of low-energy protein models previously found using different optimization techniques. A high-frequency term is added to this expansion by projecting the best decoy onto the PCA basis set and calculating the residual model. This term is aimed at providing high-frequency details in the energy optimization. The goal of this research is to analyze how the dimensionality reduction affects the prediction capability of the PSO procedure. For that purpose, different proteins from the Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments were modeled. In all cases, both the energy of the best decoy and the distance to the native structure decreased. Our analysis also shows how the predicted backbone structure of the native conformation and of alternative low-energy states varies with the PCA dimensionality. Generally speaking, the reconstruction can be successfully achieved with 10 principal components and the high-frequency term. We also provide a computational analysis of the protein energy landscape for the inverse problem of reconstructing the structure from a reduced number of principal components, showing that the dimensionality reduction alleviates the ill-posed character of this high-dimensional energy optimization problem. The procedure explained in this paper is very fast and allows testing different PCA expansions. Our results show that PSO improves the energy of the best decoy used in the PCA when an adequate number of PCA terms is considered.
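The pipeline described in the abstract (a PCA basis built from low-energy decoys, a residual term taken from the best decoy, and PSO over the PCA coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`pca_basis`, `residual_term`, `reconstruct`, `pso`, `demo`), the PSO parameters, the synthetic decoys, and the quadratic toy energy are all assumptions made for illustration; a real application would use a protein energy function and actual decoy coordinates.

```python
import numpy as np

def pca_basis(decoys, k):
    """Return the mean model and the first k principal directions (as rows)."""
    mean = decoys.mean(axis=0)
    # SVD of the centered decoy matrix; rows of vt are orthonormal directions.
    _, _, vt = np.linalg.svd(decoys - mean, full_matrices=False)
    return mean, vt[:k]

def residual_term(best, mean, basis):
    """Project the best decoy onto the basis; return coefficients and residual."""
    centered = best - mean
    coeffs = basis @ centered
    # The residual is the part of the best decoy the k-term expansion misses.
    return coeffs, centered - basis.T @ coeffs

def reconstruct(coeffs, mean, basis, residual):
    """Map reduced coordinates back to a full model, adding the residual term."""
    return mean + basis.T @ coeffs + residual

def pso(energy, start, n_particles=30, n_iter=200, spread=1.0,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Plain global-best PSO (assumed variant) minimizing `energy` around `start`."""
    rng = np.random.default_rng(seed)
    pos = start + spread * rng.standard_normal((n_particles, start.size))
    pos[0] = start  # keep the starting point so the result is never worse
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_e = np.array([energy(p) for p in pos])
    g, g_e = pbest[pbest_e.argmin()].copy(), pbest_e.min()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        e = np.array([energy(p) for p in pos])
        improved = e < pbest_e
        pbest[improved], pbest_e[improved] = pos[improved], e[improved]
        if pbest_e.min() < g_e:
            g, g_e = pbest[pbest_e.argmin()].copy(), pbest_e.min()
    return g, g_e

def demo(k=10, seed=1):
    """Toy run: synthetic 'native' vector, noisy decoys, quadratic energy."""
    rng = np.random.default_rng(seed)
    native = rng.standard_normal(60)  # stand-in for flattened coordinates
    decoys = native + 0.5 * rng.standard_normal((25, 60))

    def energy_full(x):
        # Hypothetical energy: squared distance to the synthetic native model.
        return float(np.sum((x - native) ** 2))

    best = decoys[np.argmin([energy_full(d) for d in decoys])]
    mean, basis = pca_basis(decoys, k)
    coeffs, resid = residual_term(best, mean, basis)

    def energy_reduced(a):
        # Energy evaluated in the k-dimensional PCA coefficient space.
        return energy_full(reconstruct(a, mean, basis, resid))

    _, e_opt = pso(energy_reduced, coeffs)
    rebuilt = reconstruct(coeffs, mean, basis, resid)
    return energy_full(best), e_opt, rebuilt, best
```

Calling `demo()` returns the toy energy of the best decoy, the energy found by PSO in the reduced space, and the best decoy rebuilt from its coefficients plus the residual. Because the swarm in this sketch is seeded with the best decoy's own coefficients, the optimized energy cannot exceed the starting one, and including the residual term makes the rebuilt decoy match the original up to rounding.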
Pages: 34