A robust protein language model for SARS-CoV-2 protein-protein interaction network prediction

Cited by: 8
Authors
Ozger, Zeynep Banu [1]
Affiliation
[1] Sutcu Imam Univ, Dept Comp Engn, TR-46040 Kahramanmaras, Turkiye
Keywords
Protein-protein interaction; Protein language model; SARS-CoV-2; Virus-host interaction; Natural language processing
DOI
10.1016/j.artmed.2023.102574
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Protein-protein interaction (PPI) is one of the ways viruses interact with their hosts. Identifying protein interactions between viruses and hosts therefore helps explain how virus proteins work, how viruses replicate, and how they cause disease. SARS-CoV-2 is a new virus that emerged from the coronavirus family in 2019 and caused a worldwide pandemic. Detecting the human proteins that interact with this novel virus strain plays an important role in monitoring the cellular processes of virus-associated infection. Within the scope of the study, a natural language processing-based collective learning method is proposed for the prediction of potential SARS-CoV-2-human PPIs. Protein language models were obtained with the prediction-based word2vec and doc2vec embedding methods and the frequency-based tf-idf method. Known interactions were represented with the proposed language models and with traditional feature extraction methods (conjoint triad and repeat pattern), and their performances were compared. Models were trained on the interaction data with support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (KNN), naive Bayes (NB), decision tree (DT), and ensemble algorithms. Experimental results show that protein language models are a promising protein representation method for protein-protein interaction prediction. The term frequency-inverse document frequency (tf-idf)-based language model predicted SARS-CoV-2 protein-protein interactions with an error of 1.4%. Additionally, the decisions of high-performing learning models for different feature extraction methods were combined with a collective voting approach to make new interaction predictions. For 10,000 human proteins, the combined-decision models predicted 285 new potential interactions.
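To make the frequency-based representation mentioned in the abstract more concrete, the following is a minimal sketch of tf-idf computed over protein sequences. It assumes each sequence is tokenized into overlapping 3-mers that play the role of "words"; the tokenization, the unsmoothed idf formula, the helper names (kmers, tfidf_vectors) and the toy sequences are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split an amino-acid sequence into overlapping k-mers ("words")."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_vectors(sequences, k=3):
    """Return (vocabulary, list of sparse tf-idf dicts), one dict per sequence."""
    docs = [Counter(kmers(s, k)) for s in sequences]
    n_docs = len(docs)

    # Document frequency: number of sequences that contain each k-mer.
    df = Counter()
    for doc in docs:
        df.update(doc.keys())
    vocab = sorted(df)

    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vec = {}
        for term, count in doc.items():
            tf = count / total                 # relative frequency in this sequence
            idf = math.log(n_docs / df[term])  # plain idf, no smoothing (assumption)
            vec[term] = tf * idf
        vectors.append(vec)
    return vocab, vectors


# Toy sequences, made up for illustration; they are not real proteins.
toy = ["MKVLAAGMKV", "MKVLTTGQQA", "GQQAAGMKVL"]
vocab, vectors = tfidf_vectors(toy, k=3)
```

In this sketch a k-mer that occurs in every sequence receives weight zero, while a k-mer unique to one sequence is weighted up. A virus-host pair could then be represented, for example, by concatenating the two proteins' vectors before passing them to a classifier; that pairing step is likewise an assumption here, not a description of the paper's pipeline.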
Pages: 14