StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens

Cited by: 2
Authors
Charoenkwan, Phasit [1]
Schaduangrat, Nalini [2]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Res Innovat & Biomed Informat, Bangkok 10700, Thailand
Keywords
T-cell antigen; Bioinformatics; Stacking strategy; Feature selection; Machine learning; PREDICTION;
DOI
10.1186/s12859-023-05421-x
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Background: The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and for harnessing their potential in anticancer vaccine development. In this context, TTCAs are highly promising. However, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need to develop a robust model that can achieve higher accuracy and precision. Results: In this study, we propose a new stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. Firstly, we constructed 156 different baseline models by combining 12 different feature encoding schemes with 13 popular ML algorithms. Secondly, these baseline models were trained and employed to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined based on the feature selection strategy and then used for the construction of our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods on the independent test, with an accuracy of 0.932 and a Matthews correlation coefficient of 0.866. Conclusions: In summary, the proposed stacking ensemble learning-based framework of StackTTCA could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server to maximize user convenience for high-throughput screening of novel TTCAs.
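The stacking workflow summarized in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation: the two encodings, the `CentroidModel` toy learner, the function names, and the four peptides are all hypothetical placeholders chosen for illustration, and the feature selection step is omitted. It only shows how each (encoding, learner) pair contributes one probabilistic feature, how a meta-model is trained on those features, and how accuracy and the Matthews correlation coefficient are computed.

```python
from itertools import product
from math import dist, sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq):
    """Amino acid composition: frequency of each of the 20 standard residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def hydrophobic_fraction(seq):
    """Hypothetical one-dimensional encoding, used only for illustration."""
    return [sum(seq.count(a) for a in "AILMFVW") / len(seq)]

class CentroidModel:
    """Toy baseline learner standing in for a real ML algorithm."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        # Score in [0, 1]: closer to the positive centroid gives a higher value.
        d0, d1 = dist(x, self.centroids[0]), dist(x, self.centroids[1])
        return d0 / (d0 + d1) if d0 + d1 > 0 else 0.5

def probabilistic_features(encoders, learners, train_seqs, train_y, seqs):
    """Return one row per sequence, one column per (encoding, learner) pair."""
    columns = []
    for encode, make_learner in product(encoders, learners):
        model = make_learner().fit([encode(s) for s in train_seqs], train_y)
        columns.append([model.predict_proba(encode(s)) for s in seqs])
    return [list(row) for row in zip(*columns)]

def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return (tp + tn) / len(pairs), mcc

# Invented toy peptides (not real TTCA data): label 1 = positive, 0 = negative.
train_seqs = ["AILMFVWAL", "VVLLAAIIF", "DEKRDEKRK", "STNQSTNQG"]
train_y = [1, 1, 0, 0]

encoders = [aac, hydrophobic_fraction]  # the abstract reports 12 encodings
learners = [CentroidModel]              # the abstract reports 13 ML algorithms

# For brevity the meta-features come from models fitted on the same data;
# stacking in practice usually uses cross-validated predictions instead.
meta_X = probabilistic_features(encoders, learners, train_seqs, train_y, train_seqs)
meta_model = CentroidModel().fit(meta_X, train_y)
predictions = [int(meta_model.predict_proba(row) >= 0.5) for row in meta_X]
acc, mcc = accuracy_and_mcc(train_y, predictions)
```

With 12 encodings and 13 learners, the same loop would yield 12 × 13 = 156 columns, matching the number of baseline models reported in the abstract. Scores computed on the training peptides here say nothing about real performance; the reported 0.932 accuracy and 0.866 MCC come from the paper's independent test set.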
Pages: 16
Related papers
50 in total
  • [41] scHiCStackL: a stacking ensemble learning-based method for single-cell Hi-C classification using cell embedding
    Wu, Hao
    Wu, Yingfu
    Jiang, Yuhong
    Zhou, Bing
    Zhou, Haoru
    Chen, Zhongli
    Xiong, Yi
    Liu, Quanzhong
    Zhang, Hongming
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (01)
  • [42] HIGH-THROUGHPUT SEQUENCING OF TUMOR-ASSOCIATED T CELL RECEPTORS IN HUMAN AND MURINE GLIOMA
    Sims, Jennifer
    Grinshpun, Boris
    Feng, Yaping
    Amendolara, Benjamin
    Shen, Yufeng
    Canoll, Peter
    Sims, Peter
    Bruce, Jeffrey
    NEURO-ONCOLOGY, 2013, 15 : 66 - 66
  • [43] StackEPI: identification of cell line-specific enhancer–promoter interactions based on stacking ensemble learning
    Yongxian Fan
    Binchao Peng
    BMC Bioinformatics, 23
  • [44] Identification of Inhibitors of the Association of ZAP-70 with the T Cell Receptor by High-Throughput Screen
    Visperas, Patrick R.
    Wilson, Christopher G.
    Winger, Jonathan A.
    Yan, Qingrong
    Lin, Kevin
    Arkin, Michelle R.
    Weiss, Arthur
    Kuriyan, John
    SLAS DISCOVERY, 2017, 22 (03) : 324 - 331
  • [45] Deep learning-based high-throughput phenotyping can drive future discoveries in plant reproductive biology
    Warman, Cedar
    Fowler, John E.
    PLANT REPRODUCTION, 2021, 34 (02) : 81 - 89
  • [46] High-Throughput Measurement and Machine Learning-Based Prediction of Collision Cross Sections for Drugs and Drug Metabolites
    Ross, Dylan H.
    Seguin, Ryan P.
    Krinsky, Allison M.
    Xu, Libin
    JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY, 2022, 33 (06) : 1061 - 1072
  • [47] Deep learning-based high-throughput phenotyping can drive future discoveries in plant reproductive biology
    Cedar Warman
    John E. Fowler
    Plant Reproduction, 2021, 34 : 81 - 89
  • [48] Identification of novel regulators of apoptosis using a high-throughput cell-based screen
    Park, Kyung Mi
    Kang, Eunju
    Jeon, Yeo-Jin
    Kim, Nayoung
    Kim, Nam-Soon
    Yoo, Hyang-Sook
    Yeom, Young Il
    Kim, Soo Jung
    MOLECULES AND CELLS, 2007, 23 (02) : 170 - 174
  • [49] High-throughput identification of T-lymphocyte antigens from Anaplasma marginale expressed using in vitro transcription and translation
    Lopez, Job E.
    Beare, Paul A.
    Heinzen, Robert A.
    Norimine, Junzo
    Lahmers, Kevin K.
    Palmer, Guy H.
    Brown, Wendy C.
    JOURNAL OF IMMUNOLOGICAL METHODS, 2008, 332 (1-2) : 129 - 141
  • [50] High-throughput identification of naturally occurring T-cell receptors with therapeutic potential against tumor-associated, viral and neoantigens
    Klinger, Mark
    Ebert, Peter
    Osborne, Edward
    Taniguchi, Ruth
    Hu, Joyce
    Hayes, Tim
    Benzeno, Sharon
    Carbo, Adria
    Laur, Melanie
    Eggers, Erica
    Robins, Harlan
    CANCER IMMUNOLOGY RESEARCH, 2019, 7 (02)