An Ensemble Classifiers for Improved Prediction of Native-Non-Native Protein-Protein Interaction

Cited by: 0
Authors
Pratiwi, Nor Kumalasari Caecar [1, 2]
Tayara, Hilal [3]
Chong, Kil To [1, 4]
Affiliations
[1] Jeonbuk Natl Univ, Dept Elect & Informat Engn, Jeonju 54896, South Korea
[2] Telkom Univ, Dept Elect Engn, Bandung 40257, West Java, Indonesia
[3] Jeonbuk Natl Univ, Sch Int Engn & Sci, Jeonju 54896, South Korea
[4] Jeonbuk Natl Univ, Adv Elect & Informat Res Ctr, Jeonju 54896, South Korea
Funding
National Research Foundation, Singapore
Keywords
protein-protein interaction; machine learning; ensemble classifiers; drug discovery; computational biology; classification; binding
DOI
10.3390/ijms25115957
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
In this study, we present an innovative approach to improve the prediction of protein-protein interactions (PPIs) through the utilization of an ensemble classifier, specifically focusing on distinguishing between native and non-native interactions. Leveraging the strengths of various base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates these diverse predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics. This suggests that it may not be well-suited to capture the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handling datasets effectively and incorporating regularizations to avoid over-fitting. Our findings indicate that the ensemble method enhances the predictive capability of PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems.
Pages: 19