Two-stage support vector regression approach for predicting accessible surface areas of amino acids

Cited by: 46
Authors
Nguyen, MN
Rajapakse, JC [1]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Bioinformat Res Ctr, Singapore 639798, Singapore
[2] MIT, Biol Engn Div, Cambridge, MA USA
Keywords
protein structure prediction; accessible surface area; solvent accessibility; support vector regression; PSI-BLAST
DOI
10.1002/prot.20883
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular biology]
Discipline classification code
071010; 081704
Abstract
We address the problem of predicting the solvent accessible surface area (ASA) of amino acid residues in protein sequences, without classifying them into buried and exposed types. A two-stage support vector regression (SVR) approach is proposed to predict real values of ASA from the position-specific scoring matrices generated from PSI-BLAST profiles. By adding an SVR as the second stage to capture the influence of neighboring residues' ASA values on that of a given residue, the two-stage SVR approach improves mean absolute errors by up to 3.3% and achieves correlation coefficients of 0.66, 0.68, and 0.67 on the Manesh dataset of 215 proteins, the Barton dataset of 502 nonhomologous proteins, and the Carugo dataset of 338 proteins, respectively, which are better than the scores published earlier on these datasets. A Web server for protein ASA prediction using the two-stage SVR method has been developed and is available (http://bire.ntu.edu.sg/~pas0186457/asa.html).
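The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the zero-padding at sequence ends, the window half-widths, and the use of Pearson correlation are all assumptions, and the two regressors are passed in as plain callables standing in for trained SVR models.

```python
from math import sqrt
from typing import Callable, List, Sequence

Regressor = Callable[[List[float]], float]  # hypothetical stand-in for a trained SVR


def window_features(rows: Sequence[Sequence[float]], index: int, half_width: int) -> List[float]:
    """Concatenate the rows in a window centred on `index`.

    Positions that fall outside the sequence are zero-padded (an assumption).
    """
    width = len(rows[0])
    features: List[float] = []
    for pos in range(index - half_width, index + half_width + 1):
        if 0 <= pos < len(rows):
            features.extend(rows[pos])
        else:
            features.extend([0.0] * width)
    return features


def two_stage_predict(pssm: Sequence[Sequence[float]], stage1: Regressor, stage2: Regressor,
                      half_width1: int, half_width2: int) -> List[float]:
    """Stage 1 maps PSSM windows to ASA estimates; stage 2 refines each
    estimate using the stage-1 estimates of neighbouring residues."""
    first = [stage1(window_features(pssm, i, half_width1)) for i in range(len(pssm))]
    first_rows = [[value] for value in first]
    return [stage2(window_features(first_rows, i, half_width2)) for i in range(len(first))]


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    """Average absolute difference between predicted and observed values."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pearson_correlation(pred: Sequence[float], true: Sequence[float]) -> float:
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(true)
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    sd_p = sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_t = sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (sd_p * sd_t)
```

In practice each callable would wrap a fitted regression model, and the second-stage model would be trained on the first-stage outputs for the training proteins.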
Pages: 542 - 550
Page count: 9
Related papers
50 in total
  • [41] A two-stage support-vector-regression optimization model for municipal solid waste management - A case study of Beijing, China
    Dai, C.
    Li, Y. P.
    Huang, G. H.
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2011, 92 (12) : 3023 - 3037
  • [42] Loss given default estimation: a two-stage model with classification tree-based boosting and support vector logistic regression
    Tanoue, Yuta
    Yamashita, Satoshi
    JOURNAL OF RISK, 2019, 21 (04) : 19 - 37
  • [43] Monthly Runoff forecasting using A Climate-driven Model Based on Two-stage Decomposition and Optimized Support Vector Regression
    Jia, Zhuo
    Peng, Yuhao
    Li, Qin
    Xiao, Rui
    Chen, Xue
    Cheng, Zhijin
    WATER RESOURCES MANAGEMENT, 2024, 38 (14) : 5701 - 5722
  • [44] LogP Prediction for Blocked Tripeptides with Amino Acids Descriptors (HMLP) by Multiple Linear Regression and Support Vector Regression
    Yin, Jiajian
    2011 INTERNATIONAL CONFERENCE ON ENVIRONMENT SCIENCE AND BIOTECHNOLOGY (ICESB 2011), 2011, 8 : 173 - 178
  • [45] A Two-Stage Response Surface Approach to Modeling Drug Interaction
    Zhao, Wei
    Zhang, Lanju
    Zeng, Lingmin
    Yang, Harry
    STATISTICS IN BIOPHARMACEUTICAL RESEARCH, 2012, 4 (04) : 375 - 383
  • [46] Support vector regression with feature selection for the multivariate calibration of spectrofluorimetric determination of aromatic amino acids
    Li, Guo-Zheng
    Meng, Hao-Hua
    Yang, Mary Qu
    Yang, Jack Y.
    PROCEEDINGS OF THE 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, VOLS I AND II, 2007, : 842 - +
  • [47] A Bayesian two-stage regression approach of analysing longitudinal outcomes with endogeneity and incompleteness
    Bhuyan, Prajamitra
    Biswas, Jayabrata
    Ghosh, Pulak
    Das, Kiranmoy
    STATISTICAL MODELLING, 2019, 19 (02) : 157 - 173
  • [48] A two-stage decision-support system for floating debris collection in reservoir areas
    Gao, Pan
    Du, Wangmiao
    Yu, Hao
    Zhao, Xu
    COMPUTERS & INDUSTRIAL ENGINEERING, 2023, 185
  • [49] A two-stage classification method for borehole-wall images with support vector machine
    Deng, Zhaopeng
    Cao, Maoyong
    Rai, Laxmisha
    Gao, Wei
    PLOS ONE, 2018, 13 (06):
  • [50] Automatic spike detection in EEG by a two-stage procedure based on support vector machines
    Acir, N
    Güzelis, C
    COMPUTERS IN BIOLOGY AND MEDICINE, 2004, 34 (07) : 561 - 575