Prediction of the Bonding State of Cysteine Residues in Proteins with Machine-Learning Methods

Citations: 0
Authors
Savojardo, Castrense [1]
Fariselli, Piero [1]
Martelli, Pier Luigi [1]
Shukla, Priyank [1]
Casadio, Rita [1]
Affiliation
[1] Univ Bologna, Biocomp Grp, I-40126 Bologna, Italy
Keywords
Machine Learning; Conditional Random Fields; Disulfide Prediction; Disulfide Bonding State; Protein Structure Prediction; Protein Folding
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
In this paper we evaluate the performance of machine learning methods in predicting the bonding state of cysteines from protein sequences. This task is the first step in identifying disulfide bonds in proteins. We score the performance of three approaches: 1) Hidden Support Vector Machines (HSVMs), which integrate SVM predictions with a Hidden Markov Model; 2) SVM-HMMs, which discriminatively train models that are isomorphic to a kth-order hidden Markov model; 3) Grammatical-Restrained Hidden Conditional Random Fields (GRHCRFs), which we recently introduced. We evaluate two encoding schemes, one based on the sequence profile and one on the position-specific scoring matrix (PSSM), both computed with the PSI-BLAST program, and we show that all the methods perform better when the evolutionary information is encoded with the PSSM than with the sequence profile. Among the methods, GRHCRFs perform slightly better than the others, achieving a per-protein accuracy of 87% with a Matthews correlation coefficient (C) of 0.73. Finally, we investigate the difference between disulfide bonding state predictions in Eukaryotes and Prokaryotes. Our analysis shows that the per-protein accuracy in Prokaryotic proteins is higher than that in Eukaryotes (0.88 vs 0.83). However, given the paucity of bonded cysteines in Prokaryotes as compared to Eukaryotes, the Matthews correlation coefficient is drastically reduced (0.48 vs 0.80).
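The two scores quoted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the toy labels, and the convention that a protein counts as correct only when every one of its cysteines is predicted correctly are assumptions made here for illustration. The Matthews correlation coefficient is computed per cysteine from the standard confusion-matrix formula.

```python
import math

def matthews_cc(true_labels, pred_labels):
    """Matthews correlation coefficient for binary labels (1 = bonded, 0 = free)."""
    tp = tn = fp = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t and p:
            tp += 1
        elif not t and not p:
            tn += 1
        elif p:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def per_protein_accuracy(proteins):
    """Fraction of proteins whose cysteines are ALL predicted correctly.

    `proteins` is a list of (true_labels, pred_labels) pairs, one per protein.
    This all-or-nothing definition is an assumption of this sketch.
    """
    correct = sum(1 for t, p in proteins if list(t) == list(p))
    return correct / len(proteins)

# Hypothetical toy data: three proteins, eight cysteines in total.
toy = [
    ([1, 1, 0, 0], [1, 1, 0, 0]),  # fully correct
    ([0, 0], [0, 0]),              # fully correct
    ([1, 1], [1, 0]),              # one bonded cysteine missed
]
all_true = [label for t, _ in toy for label in t]
all_pred = [label for _, p in toy for label in p]

accuracy = per_protein_accuracy(toy)
mcc = matthews_cc(all_true, all_pred)
```

Because the coefficient depends on all four confusion-matrix counts, it is sensitive to class imbalance: when bonded cysteines are rare, a few errors on the bonded class can lower it sharply even while per-protein accuracy stays high, which is consistent with the Prokaryote versus Eukaryote contrast reported above.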
Pages: 98-111
Number of pages: 14