An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction

Authors
Pratiwi, Nor Kumalasari Caecar [1, 2]
Tayara, Hilal [3]
Chong, Kil To [1, 4]
Affiliations
[1] Jeonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[2] Telkom Univ, Dept Elect Engn, Bandung 40257, West Java, Indonesia
[3] Jeonbuk Natl Univ, Sch Int Engn & Sci, Jeonju 54896, South Korea
[4] Jeonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
Funding
National Research Foundation of Singapore;
Keywords
protein-protein interaction; machine learning; ensemble classifiers; drug discovery; computational biology; CLASSIFICATION; BINDING;
DOI
10.3390/ijms25115957
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
In this study, we present an innovative approach to improve the prediction of protein-protein interactions (PPIs) through the utilization of an ensemble classifier, specifically focusing on distinguishing between native and non-native interactions. Leveraging the strengths of various base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates these diverse predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics. This suggests that it may not be well-suited to capture the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handling datasets effectively and incorporating regularizations to avoid over-fitting. Our findings indicate that the ensemble method enhances the predictive capability of PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems.
Pages: 19