LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings

Cited by: 4
Authors
Liu, Yufeng [1]
Liu, Yinbo [1]
Wang, Shuyu [1]
Zhu, Xiaolei [1]
Affiliations
[1] Anhui Agr Univ, Sch Sci, Hefei 230036, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Linear B cell epitope; BERT; XGBoost; Natural language processing; Machine learning; SITES;
DOI
10.1007/s12539-023-00549-z
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Accurately detecting linear B-cell epitopes (BCEs) is important for vaccine design, immunodiagnostic testing, antibody production, and disease prevention and treatment. Wet-lab experiments for determining linear BCEs are expensive and laborious, and they cannot keep pace with the volume of protein sequence data now available. Computational methods, by contrast, can identify linear BCEs efficiently and at low cost. Although several computational methods are available, their performance is still unsatisfactory. We therefore propose a new method, LBCE-XGB, which predicts linear BCEs using the XGBoost algorithm. To represent the biological information contained in peptide sequences, residue embeddings were obtained from a pre-trained, domain-specific BERT model. In addition, five other types of features, including amino acid composition and an amino acid antigenicity scale, were extracted. The best feature combination was selected according to the cross-validation results. Compared with models built using other deep learning and machine learning algorithms, LBCE-XGB achieves the best performance, with an AUROC of 0.845 in fivefold cross-validation. On the independent test set, the model attains an AUROC of 0.838, which is substantially higher than that of other state-of-the-art methods. These results indicate that BERT representations can be an effective feature for predicting linear BCEs, and we believe that LBCE-XGB could be a useful tool for detecting linear B-cell epitopes with high accuracy and at low cost.
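Two quantities named in the abstract, amino acid composition and AUROC, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `amino_acid_composition` and `auroc` are hypothetical, the 20-letter residue alphabet is an assumption about how composition is defined here, and the BERT embeddings and XGBoost classifier are deliberately left out.

```python
from collections import Counter

# Assumption: composition is taken over the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return the fraction of each standard residue in the peptide (20 values)."""
    counts = Counter(peptide.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("peptide contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

In a pipeline of the kind the abstract describes, a vector like the one returned by `amino_acid_composition` would be concatenated with the other feature types, and a score such as `auroc` would be averaged over the five cross-validation folds to compare feature combinations.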
Pages: 293 - 305
Page count: 13