StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens

Cited by: 2
Authors
Charoenkwan, Phasit [1]
Schaduangrat, Nalini [2]
Shoombuatong, Watshara [2]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Res Innovat & Biomed Informat, Bangkok 10700, Thailand
Keywords
T-cell antigen; Bioinformatics; Stacking strategy; Feature selection; Machine learning; Prediction
DOI
10.1186/s12859-023-05421-x
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and for utilizing their potential in anticancer vaccine development. In this context, TTCAs are highly promising. However, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need to develop a robust model that can achieve higher accuracy and precision.

Results: In this study, we propose a new stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. Firstly, we constructed 156 different baseline models by using 12 different feature encoding schemes and 13 popular ML algorithms. Secondly, these baseline models were trained and employed to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined based on the feature selection strategy and then used for the construction of our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods on the independent test, with an accuracy of 0.932 and a Matthews correlation coefficient of 0.866.

Conclusions: In summary, the proposed stacking ensemble learning-based framework of StackTTCA could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server to maximize user convenience for high-throughput screening of novel TTCAs.
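The two quantitative elements of the abstract can be illustrated with a minimal Python sketch: pairing 12 feature encodings with 13 ML algorithms yields the 156 baseline models, and accuracy and the Matthews correlation coefficient (MCC) are both computed from a binary confusion matrix. The encoding and algorithm names below are placeholders, since the abstract gives only their counts, and the confusion-matrix numbers are made up for illustration; none of this is the authors' code.

```python
from itertools import product
from math import sqrt

# Placeholder names (assumption): the abstract states only the counts,
# 12 feature encodings and 13 ML algorithms, not which ones were used.
ENCODINGS = [f"encoding_{i}" for i in range(1, 13)]
ALGORITHMS = [f"algorithm_{j}" for j in range(1, 14)]


def baseline_grid():
    """Pair every feature encoding with every ML algorithm."""
    return list(product(ENCODINGS, ALGORITHMS))


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # One baseline model per (encoding, algorithm) pair.
    print(len(baseline_grid()))
    # Made-up counts for illustration only, not results from the paper.
    print(accuracy(45, 40, 5, 10), mcc(45, 40, 5, 10))
```

In a stacking setup of this kind, each baseline model would contribute one predicted probability per peptide, so the full probabilistic feature vector would have up to 156 entries before feature selection.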
Pages: 16