VirusHound-I: prediction of viral proteins involved in the evasion of host adaptive immune response using the random forest algorithm and generative adversarial network for data augmentation

Cited by: 1

Authors:

Beltran, Jorge F. [1]

Belen, Lisandra Herrera [2]

Farias, Jorge G. [3]

Zamorano, Mauricio [4]

Lefin, Nicolas [5]

Miranda, Javiera [5]

Parraguez-Contreras, Fernanda [5]

Affiliations:

[1] Univ La Frontera, Fac Engn & Sci, Dept Chem Engn, Ave Francisco Salazar 01145, Temuco, Chile

[2] Univ Santo Tomas Temuco, Temuco, Chile

[3] Univ La Frontera, Fac Engn & Sci, Temuco, Chile

[4] Univ La Frontera Temuco, Dept Chem Engn, Temuco, Chile

[5] Univ La Frontera, Temuco, Chile

Source:

BRIEFINGS IN BIOINFORMATICS | 2024 / Vol. 25 / Issue 01

Keywords:

virus; pathogen; machine learning; neural network; deep learning; protein; SUBCELLULAR-LOCALIZATION; STRATEGIES; MECHANISMS;

DOI:

10.1093/bib/bbad434

Chinese Library Classification (CLC) code:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Throughout evolution, pathogenic viruses have developed different strategies to evade the response of the adaptive immune system. To carry out successful replication, some pathogenic viruses encode different proteins that manipulate the molecular mechanisms of host cells. Currently, there are different bioinformatics tools for virus research; however, none of them focuses on predicting viral proteins that evade the adaptive immune system. In this work, we have developed a novel tool, named VirusHound-I, based on machine and deep learning for predicting this type of viral protein. This tool is based on a model developed with the multilayer perceptron algorithm using the dipeptide composition molecular descriptor. In this study, we have also demonstrated the robustness of our strategy for data augmentation of the positive dataset based on generative adversarial networks. During the 10-fold cross-validation step on the training dataset, the predictive model showed 0.947 accuracy, 0.994 precision, 0.943 F1 score, 0.995 specificity, 0.896 sensitivity, 0.894 kappa, 0.898 Matthews correlation coefficient and 0.989 AUC. During the testing step, the model showed 0.964 accuracy, 1.0 precision, 0.967 F1 score, 1.0 specificity, 0.936 sensitivity, 0.929 kappa, 0.931 Matthews correlation coefficient and 1.0 AUC. Taking this model into account, we have developed a tool called VirusHound-I that makes it possible to predict viral proteins that evade the host's adaptive immune system. We believe that VirusHound-I can be very useful in accelerating studies on the molecular mechanisms of evasion of pathogenic viruses, as well as in the discovery of therapeutic targets.
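The abstract names the dipeptide composition descriptor but does not define it. As a rough illustration only, the sketch below computes one common form of that descriptor: the relative frequency of each of the 400 ordered pairs of the 20 standard amino acids in a protein sequence. The function name `dipeptide_composition`, the alphabetical ordering of the features, and the choice to skip pairs containing non-standard residues are assumptions made for this example; the record does not say how the authors implemented the descriptor.

```python
from itertools import product

# The 20 standard amino acids, in alphabetical one-letter order (an assumed ordering).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 400 ordered dipeptides, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list of dipeptide frequencies for a protein sequence.

    Hypothetical helper for illustration. Overlapping pairs are counted, and
    pairs containing a non-standard residue (e.g. "X") are skipped, which is
    an assumption rather than a documented choice of the original tool.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short or made only of non-standard residues.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features))                      # number of features
    print(features[DIPEPTIDES.index("AC")])   # frequency of the "AC" dipeptide
```

A vector of this kind could then be passed to any classifier; the normalisation used here (dividing by the number of valid dipeptides) is one of several conventions in use.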


Pages: 8
