LBCE-XGB: A XGBoost Model for Predicting Linear B-Cell Epitopes Based on BERT Embeddings

Cited by: 4
Authors
Liu, Yufeng [1 ]
Liu, Yinbo [1 ]
Wang, Shuyu [1 ]
Zhu, Xiaolei [1 ]
Affiliations
[1] Anhui Agr Univ, Sch Sci, Hefei 230036, Anhui, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Linear B cell epitope; BERT; XGBoost; Natural language processing; Machine learning; SITES;
DOI
10.1007/s12539-023-00549-z
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
Accurate detection of linear B-cell epitopes (BCEs) is important for vaccine design, immunodiagnostic testing, antibody production, and disease prevention and treatment. Wet-lab experiments for determining linear BCEs are expensive and laborious and cannot keep pace with the volume of protein sequence data now available. Computational methods, by contrast, can identify linear BCEs efficiently and at low cost. Although several computational methods are available, their performance is still unsatisfactory. We therefore propose a new method, LBCE-XGB, which predicts linear BCEs using the XGBoost algorithm. To represent the biological information contained in peptide sequences, residue embeddings were obtained from a pre-trained, domain-specific BERT model. In addition, five other types of features, including amino acid composition and an amino acid antigenicity scale, were extracted. The best feature combination was selected according to the cross-validation results. Compared with models built using other deep learning and machine learning algorithms, LBCE-XGB achieves the best performance, with an AUROC of 0.845 in fivefold cross-validation. On the independent test set, the model attains an AUROC of 0.838, which is substantially higher than that of other state-of-the-art methods. These results indicate that BERT representations can be an effective feature for predicting linear BCEs, and we believe that LBCE-XGB could be a useful tool for detecting linear B-cell epitopes with high accuracy and at low cost.
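As a rough illustration of two ingredients the abstract mentions, the sketch below computes an amino acid composition vector for a peptide and an AUROC score for a set of predictions. It is not the authors' implementation: the abstract does not give the exact feature definitions, and the function names `amino_acid_composition` and `auroc`, as well as the example peptide and scores, are assumptions made here for illustration. The BERT embeddings and the XGBoost classifier themselves are not reproduced.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return the fraction of each standard amino acid in the peptide.

    Non-standard letters count toward the length but get no feature,
    so the vector may sum to less than 1 for such peptides.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide)
    return [counts[aa] / len(peptide) for aa in AMINO_ACIDS]


def auroc(labels, scores):
    """AUROC as the probability that a random positive outscores a
    random negative (Mann-Whitney formulation); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical usage: a made-up peptide and made-up prediction scores.
features = amino_acid_composition("AAC")
score = auroc([1, 1, 0, 0], [0.9, 0.2, 0.8, 0.1])
```

In a pipeline like the one the abstract outlines, a composition vector of this kind would be concatenated with the other feature types before training, and the AUROC would be averaged over the five cross-validation folds.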
Pages: 293 - 305
Page count: 13
Related papers
50 in total
  • [22] COBEpro: a novel system for predicting continuous B-cell epitopes
    Sweredoski, Michael J.
    Baldi, Pierre
    PROTEIN ENGINEERING DESIGN & SELECTION, 2009, 22 (03) : 113 - 120
  • [23] PEASE: predicting B-cell epitopes utilizing antibody sequence
    Sela-Culang, Inbal
    Ashkenazi, Shaul
    Peters, Bjoern
    Ofran, Yanay
    BIOINFORMATICS, 2015, 31 (08) : 1313 - 1315
  • [24] Epitopia: a web-server for predicting B-cell epitopes
    Rubinstein, Nimrod D.
    Mayrose, Itay
    Martz, Eric
    Pupko, Tal
    BMC BIOINFORMATICS, 2009, 10 : 287
  • [25] Prediction of linear B-cell epitopes using AAT scale
    Wang, Lian
    Liu, Juan
    Zhu, Shanfeng
    Gao, YangYang
    2009 3RD INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1-11, 2009, : 291 - +
  • [26] Identification of B-Cell Linear Epitopes in the Nucleocapsid (N) Protein B-Cell Linear Epitopes Conserved among the Main SARS-CoV-2 Variants
    Rodrigues-da-Silva, Rodrigo N.
    Conte, Fernando P. P.
    da Silva, Gustavo
    Carneiro-Alencar, Ana L. L.
    Gomes, Paula R. R.
    Kuriyama, Sergio N. N.
    Neto, Antonio A. F.
    Lima-Junior, Josue C.
    VIRUSES-BASEL, 2023, 15 (04):
  • [27] PREDICTING POTENTIAL LINEAR B-CELL EPITOPES ON E GLYCOPROTEIN OF DENGUE VIRUS THROUGH IN-SILICO APPROACHES
    Nadugala, Mahesha N.
    Pushpakumara, Pradeep D.
    Premaratne, Prasad H.
    Goonasekara, Charitha L.
    AMERICAN JOURNAL OF TROPICAL MEDICINE AND HYGIENE, 2015, 93 (04) : 233 - 233
  • [28] Prediction of Linear B-Cell Epitopes with mRMR Feature Selection and Analysis
    Li, Bi-Qing
    Zheng, Lu-Lu
    Feng, Kai-Yan
    Hu, Le-Le
    Huang, Guo-Hua
    Chen, Lei
    CURRENT BIOINFORMATICS, 2016, 11 (01) : 22 - 31
  • [29] Machine learning approaches for prediction of linear B-cell epitopes on proteins
    Söllner, Johannes
    Mayer, Bernd
    JOURNAL OF MOLECULAR RECOGNITION, 2006, 19 (03) : 200 - 208
  • [30] Identification of linear B-cell epitopes within Tarp of Chlamydia trachomatis
    Zhu, Shanli
    Feng, Yan
    Chen, Jun
    Lin, Xiaoyun
    Xue, Xiangyang
    Chen, Shao
    Zhong, Xiaozhi
    Li, WenShu
    Zhang, Lifang
    JOURNAL OF PEPTIDE SCIENCE, 2014, 20 (12) : 916 - 922