A robust protein language model for SARS-CoV-2 protein-protein interaction network prediction

Cited by: 8
Authors
Ozger, Zeynep Banu [1]
Affiliations
[1] Sutcu Imam Univ, Dept Comp Engn, TR-46040 Kahramanmaras, Turkiye
Keywords
Protein-protein interaction; Protein language model; SARS-CoV-2; Virus-host interaction; Natural language processing
DOI
10.1016/j.artmed.2023.102574
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Protein-protein interaction (PPI) is one of the ways viruses interact with their hosts. Identifying protein interactions between viruses and hosts therefore helps explain how viral proteins work, how viruses replicate, and how they cause disease. SARS-CoV-2 is a novel coronavirus that emerged in 2019 and caused a worldwide pandemic. Detecting the human proteins that interact with this novel virus plays an important role in monitoring the cellular processes of virus-associated infection. This study proposes a natural language processing-based collective learning method for predicting potential SARS-CoV-2-human PPIs. Protein language models were obtained with the prediction-based word2vec and doc2vec embedding methods and the frequency-based term frequency-inverse document frequency (TF-IDF) method. Known interactions were represented with the proposed language models and with traditional feature extraction methods (conjoint triad and repeat pattern), and their performances were compared. Models were trained on the interaction data with support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (KNN), naive Bayes (NB), decision tree (DT), and ensemble algorithms. Experimental results show that protein language models are a promising protein representation method for protein-protein interaction prediction. The TF-IDF-based language model predicted SARS-CoV-2 protein-protein interactions with an error of 1.4%. Additionally, the decisions of high-performing learning models for different feature extraction methods were combined with a collective voting approach to make new interaction predictions. For 10,000 human proteins, the combined models predicted 285 new potential interactions.
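The frequency-based representation and the voting step described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. The abstract does not say how sequences are split into "words", so overlapping 3-mers are an assumption here, as are the smoothed IDF formula, the simple majority rule standing in for "collective voting", and the names kmers, tfidf_vectors, and majority_vote.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split an amino-acid sequence into overlapping k-mers (assumed tokenization)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_vectors(seqs, k=3):
    """Return (vocabulary, one TF-IDF vector per sequence)."""
    docs = [Counter(kmers(s, k)) for s in seqs]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: number of sequences that contain each k-mer.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    # Smoothed IDF (an assumed variant): a k-mer found in every sequence gets weight 1.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1 for w in vocab}
    vectors = []
    for d in docs:
        total = sum(d.values()) or 1  # guard against sequences shorter than k
        vectors.append([d[w] / total * idf[w] for w in vocab])
    return vocab, vectors


def majority_vote(predictions):
    """Combine 0/1 labels from several models; ties resolve to 0 (no interaction)."""
    return int(sum(predictions) * 2 > len(predictions))
```

In a pipeline of this kind, the vectors of a viral protein and a human protein would be combined into one feature vector for a classifier such as an SVM, and majority_vote would merge the labels produced by the classifiers trained on different feature sets.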
Pages: 14