LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings

Cited by: 4
Authors
Liu, Yufeng [1]
Liu, Yinbo [1]
Wang, Shuyu [1]
Zhu, Xiaolei [1]
Affiliations
[1] Anhui Agr Univ, Sch Sci, Hefei 230036, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Linear B cell epitope; BERT; XGBoost; Natural language processing; Machine learning; SITES;
DOI
10.1007/s12539-023-00549-z
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Accurately detecting linear B-cell epitopes (BCEs) is important for vaccine design, immunodiagnostic testing, antibody production, and disease prevention and treatment. Wet-lab experiments for determining linear BCEs are expensive and laborious, and cannot keep pace with the volume of protein sequence data now available. Computational methods, by contrast, can identify linear BCEs efficiently and at low cost. Although several computational methods are available, their performance is still not satisfactory. We therefore propose a new method, LBCE-XGB, to predict linear BCEs based on the XGBoost algorithm. To represent the biological information concealed in peptide sequences, residue embeddings were obtained from a pre-trained domain-specific BERT model. In addition, five other types of features, including amino acid composition and amino acid antigenicity scale, were extracted. The best feature combination was determined according to the cross-validation results. Compared with models built with other deep learning and machine learning algorithms, LBCE-XGB achieves the best performance, with an AUROC of 0.845 in fivefold cross-validation. On the independent test set, the model attains an AUROC of 0.838, which is substantially higher than that of other state-of-the-art methods. These results indicate that BERT representations can be an effective feature for predicting linear BCEs, and we believe that LBCE-XGB could be a useful tool for detecting linear B-cell epitopes with high accuracy and low cost.
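Two quantities named in the abstract, amino acid composition and AUROC, have standard definitions that can be sketched in a few lines. The following is a minimal illustration using only the Python standard library; it is not the authors' implementation, and the function names `amino_acid_composition` and `auroc` are hypothetical choices made for this sketch. It assumes peptides are written in the 20-letter standard amino acid alphabet and that labels are 1 (epitope) or 0 (non-epitope).

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return a 20-dimensional vector with the fraction of each standard residue.

    Non-standard letters (e.g. 'X') are counted in the length but get no slot.
    """
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide.upper())
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


def auroc(labels, scores):
    """AUROC as the probability that a positive example outscores a negative one.

    Ties contribute 0.5. This pairwise form is O(n_pos * n_neg), which is fine
    for illustration but slow on large data sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print(amino_acid_composition("AACD")[:3])
    print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))
```

In a pipeline like the one the abstract describes, a composition vector of this kind would be concatenated with the BERT residue embeddings before training the classifier, and the AUROC would be computed on each cross-validation fold; how the authors combine and pool these features is not stated here.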
Pages: 293-305
Page count: 13