A robust protein language model for SARS-CoV-2 protein-protein interaction network prediction

Cited by: 8
Author
Ozger, Zeynep Banu [1]
Affiliation
[1] Sutcu Imam Univ, Dept Comp Engn, TR-46040 Kahramanmaras, Turkiye
Keywords
Protein-protein interaction; Protein language model; SARS-CoV-2; Virus-host interaction; Natural language processing
DOI
10.1016/j.artmed.2023.102574
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Protein-protein interaction (PPI) is one of the ways viruses interact with their hosts. Identifying protein interactions between viruses and hosts therefore helps explain how virus proteins work, how they replicate, and how they cause disease. SARS-CoV-2 is a member of the coronavirus family that emerged in 2019 and caused a worldwide pandemic. Detecting the human proteins that interact with this novel virus plays an important role in monitoring the cellular processes of virus-associated infection. This study proposes a natural language processing-based collective learning method for predicting potential SARS-CoV-2-human PPIs. Protein language models were built with the prediction-based word2vec and doc2vec embedding methods and the frequency-based term frequency-inverse document frequency (tf-idf) method. Known interactions were represented with the proposed language models and with traditional feature extraction methods (conjoint triad and repeat pattern), and the performance of the representations was compared. Models were trained on the interaction data with support vector machine (SVM), artificial neural network (ANN), k-nearest neighbor (KNN), naive Bayes (NB), decision tree (DT), and ensemble algorithms. Experimental results show that protein language models are a promising protein representation method for protein-protein interaction prediction. The tf-idf-based language model predicted SARS-CoV-2 protein-protein interactions with an error of 1.4%. Additionally, the decisions of the high-performing learning models for the different feature extraction methods were combined with a collective voting approach to make new interaction predictions. For 10,000 human proteins, the models with combined decisions predicted 285 new potential interactions.
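As an illustration of the frequency-based representation and the voting step described in the abstract, the following is a minimal sketch and not the author's implementation. It assumes that a protein sequence is split into overlapping 3-mers that play the role of words; the abstract does not state the tokenization, so the value k = 3, the unsmoothed tf-idf formula, and the function names `kmers`, `tfidf_vectors`, and `majority_vote` are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def kmers(seq, k=3):
    """Split a protein sequence into overlapping k-mers ("words").

    Assumption: overlapping k-mers with k=3; the source does not specify this.
    """
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def tfidf_vectors(sequences, k=3):
    """Return (vocabulary, one tf-idf vector per sequence).

    Uses the plain textbook form tf * log(N / df) with no smoothing.
    """
    docs = [Counter(kmers(s, k)) for s in sequences]
    vocab = sorted(set().union(*docs))
    n = len(docs)
    # Document frequency: number of sequences that contain each k-mer.
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    vectors = []
    for d in docs:
        total = sum(d.values())
        vectors.append([
            (d[w] / total) * math.log(n / df[w]) if total else 0.0
            for w in vocab
        ])
    return vocab, vectors


def majority_vote(predictions):
    """Hard voting: return 1 only if more than half of the models predict 1."""
    return 1 if sum(predictions) * 2 > len(predictions) else 0


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    vocab, vectors = tfidf_vectors(["MKTA", "MKTL"])
    print(vocab)
    print(vectors)
    # Combine the 0/1 decisions of three hypothetical classifiers.
    print(majority_vote([1, 1, 0]))
```

In this sketch a k-mer shared by every sequence receives a weight of zero, so the vectors emphasize the k-mers that distinguish one protein from another. Each vector could then be passed to any of the classifiers named in the abstract, and `majority_vote` shows one simple way to combine their binary decisions.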
Pages: 14