DeepBP: Ensemble deep learning strategy for bioactive peptide prediction

Cited by: 2

Authors:

Zhang, Ming [1]

Zhou, Jianren [1]

Wang, Xiaohua [1]

Wang, Xun [1]

Ge, Fang [2,3]

Affiliations:

[1] Jiangsu Univ Sci & Technol, Sch Comp, 666 Changhui Rd, Zhenjiang 212100, Peoples R China

[2] Nanjing Univ Posts & Telecommun, State Key Lab Organ Elect & Informat Displays, 9 Wenyuan Rd, Nanjing 210023, Peoples R China

[3] Nanjing Univ Posts & Telecommun, Inst Adv Mat IAM, 9 Wenyuan Rd, Nanjing 210023, Peoples R China

Source:

BMC Bioinformatics | 2024 / Volume 25 / Issue 01

Keywords:

ACE inhibitory peptides; Anticancer peptides; Protein language model; Gated recurrent unit; Generative adversarial capsule network; ATTENTION; NETWORKS; GRU; CNN;

DOI:

10.1186/s12859-024-05974-5

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Subject classification code:

071010 ; 081704 ;

Abstract:

Background: Bioactive peptides are important bioactive molecules composed of short chains of amino acids that play various crucial roles in the body, such as regulating physiological processes and promoting immune responses and antibacterial effects. Due to their significance, bioactive peptides have broad application potential in drug development, food science, and biotechnology. Understanding their biological mechanisms will contribute new ideas for drug discovery and disease treatment.

Results: This study employs generative adversarial capsule networks (CapsuleGAN), gated recurrent units (GRU), and convolutional neural networks (CNN) as base classifiers and combines them into an ensemble through weighted voting, achieving high-precision predictions on both the angiotensin-converting enzyme (ACE) inhibitory peptides dataset and the anticancer peptides (ACP) dataset. For this method, we first utilized the protein language model ESM-2 (Evolutionary Scale Modeling) to extract relevant features for the ACE inhibitory peptides and ACP datasets. Following feature extraction, we trained three deep learning models (CapsuleGAN, GRU, and CNN) while continuously adjusting the model parameters throughout the training process. Finally, during the voting stage, different weights were assigned to the models based on their prediction accuracy, allowing the ensemble to make full use of each model's performance. Experimental results show that on the ACE inhibitory peptide dataset, the balanced accuracy is 0.926, the Matthews correlation coefficient (MCC) is 0.831, and the area under the curve is 0.966; on the ACP dataset, the accuracy (ACC) is 0.779, and the MCC is 0.558. The experimental results on both datasets are superior to existing methods, demonstrating the effectiveness of the proposed approach.

Conclusion: In this study, CapsuleGAN, GRU, and CNN were successfully employed as base classifiers to implement ensemble learning, which not only achieved good results in the prediction of two datasets but also surpassed existing methods. The ability to predict peptides with strong ACE inhibitory activity and ACPs more accurately and quickly is significant, and this work provides valuable insights for predicting other functional peptides. The source code and dataset for this experiment are publicly available at https://github.com/Zhou-Jianren/bioactive-peptides.
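The voting stage and two of the reported metrics can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes that each base classifier outputs a positive-class probability and that voting weights are simply the models' validation accuracies normalized to sum to 1; the abstract states only that weights depend on prediction accuracy. All function names, accuracies, and probabilities below are hypothetical.

```python
import math

def accuracy_weights(accuracies):
    """Normalize per-model accuracies so the voting weights sum to 1.
    Assumption: weights are proportional to accuracy."""
    total = sum(accuracies)
    return [a / total for a in accuracies]

def weighted_vote(probabilities, weights, threshold=0.5):
    """Combine one peptide's positive-class probabilities from all models."""
    score = sum(w * p for w, p in zip(weights, probabilities))
    return 1 if score >= threshold else 0

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity (both classes must be present)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp / (tp + fn) + tn / (tn + fp)) / 2

def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when the denominator vanishes."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical validation accuracies for CapsuleGAN, GRU, and CNN.
weights = accuracy_weights([0.90, 0.85, 0.80])

# Hypothetical per-model probabilities for four peptides.
model_probs = [
    [0.9, 0.8, 0.7],
    [0.2, 0.4, 0.1],
    [0.6, 0.3, 0.4],
    [0.3, 0.7, 0.8],
]
y_true = [1, 0, 1, 1]
y_pred = [weighted_vote(p, weights) for p in model_probs]

print("predictions:", y_pred)
print("balanced accuracy:", round(balanced_accuracy(y_true, y_pred), 3))
print("MCC:", round(mcc(y_true, y_pred), 3))
```

In this toy data the third peptide is misclassified because only the highest-weighted model favors the positive class, which shows how a weighted vote can still be outvoted by the other two models.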


Pages: 19
