Predicting protein secondary structure by an ensemble through feature-based accuracy estimation

Cited by: 0
Authors
Krieger, Spencer [1]
Kececioglu, John [1]
Affiliations
[1] Univ Arizona, Comp Sci, Tucson, AZ 85721 USA
Funding
National Science Foundation (USA)
Keywords
Protein secondary structure prediction; ensemble methods; feature-based accuracy estimation; method hybridization; neural networks
DOI
10.1145/3388440.3412425
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Protein secondary structure prediction is a fundamental task in computational biology, basic to many bioinformatics workflows, with a diverse collection of tools currently available. An approach from machine learning with the potential to capitalize on such a collection is ensemble prediction, which runs multiple predictors and combines their predictions into one, output by the ensemble. We conduct a thorough study of seven different approaches to ensemble secondary structure prediction, several of which are novel, and show we can indeed obtain an ensemble method that significantly exceeds the accuracy of individual state-of-the-art tools. The best approaches build on a recent technique known as feature-based accuracy estimation, which estimates the unknown true accuracy of a prediction, here using features of both the prediction output and the internal state of the prediction method. In particular, a hybrid approach to ensemble prediction that leverages accuracy estimation is now the most accurate method currently available: on average over standard CASP and PDB benchmarks, it exceeds the state-of-the-art Q3 accuracy for 3-state prediction by nearly 4%, and exceeds the Q8 accuracy for 8-state prediction by more than 8%. A preliminary implementation of our approach to ensemble protein secondary structure prediction, in a new tool we call Ssylla, is available free for non-commercial use at ssylla.cs.arizona.edu.
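As a rough illustration of the two ideas the abstract relies on, the sketch below computes Q3 accuracy (the fraction of residues whose 3-state class is predicted correctly) and combines several predictions in two simple ways: a per-residue plurality vote, and selection of the prediction with the highest estimated accuracy. This is not the method of the paper or of Ssylla. The function names, the toy sequences, and the accuracy estimates are all hypothetical, chosen only to show the general shape of ensemble prediction.

```python
from collections import Counter


def q3_accuracy(predicted: str, true: str) -> float:
    """Fraction of residues whose predicted state equals the true state."""
    if len(predicted) != len(true):
        raise ValueError("sequences must have equal length")
    if not true:
        raise ValueError("sequences must be non-empty")
    return sum(p == t for p, t in zip(predicted, true)) / len(true)


def majority_vote(predictions: list[str]) -> str:
    """Per-residue plurality vote; ties go to the state seen first in list order."""
    if not predictions:
        raise ValueError("need at least one prediction")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have equal length")
    # zip(*predictions) yields one column of states per residue position.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*predictions)
    )


def select_by_estimate(predictions: list[str], estimates: list[float]) -> str:
    """Return the prediction whose (hypothetical) estimated accuracy is highest."""
    if not predictions or len(predictions) != len(estimates):
        raise ValueError("need one estimate per prediction")
    best = max(range(len(predictions)), key=lambda i: estimates[i])
    return predictions[best]


# Toy data (hypothetical): H = helix, E = strand, C = coil.
true_states = "HHHHEEEECC"
predictions = ["HHHHEEECCC", "HHHCEEEECC", "HHHHEEEECH"]

voted = majority_vote(predictions)
chosen = select_by_estimate(predictions, [0.71, 0.84, 0.79])
```

In this toy case each individual prediction errs at a different residue, so the vote recovers every position, which is the intuition behind combining predictors. Selecting by estimate is only as good as the estimator: here it returns the second prediction because its assumed estimate is the highest.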
Pages: 10