StackTTCA: a stacking ensemble learning-based framework for accurate and high-throughput identification of tumor T cell antigens

Cited by: 2
Authors
Charoenkwan, Phasit [1 ]
Schaduangrat, Nalini [2 ]
Shoombuatong, Watshara [2 ]
Affiliations
[1] Chiang Mai Univ, Coll Arts Media & Technol, Modern Management & Informat Technol, Chiang Mai 50200, Thailand
[2] Mahidol Univ, Fac Med Technol, Ctr Res Innovat & Biomed Informat, Bangkok 10700, Thailand
Keywords
T-cell antigen; Bioinformatics; Stacking strategy; Feature selection; Machine learning; Prediction
DOI
10.1186/s12859-023-05421-x
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The identification of tumor T cell antigens (TTCAs) is crucial for providing insights into their functional mechanisms and for exploiting their potential in anticancer vaccine development. In this context, TTCAs are highly promising. However, experimental technologies for discovering and characterizing new TTCAs are expensive and time-consuming. Although many machine learning (ML)-based models have been proposed for identifying new TTCAs, there is still a need to develop a robust model that can achieve higher accuracy and precision.
Results: In this study, we propose a new stacking ensemble learning-based framework, termed StackTTCA, for accurate and large-scale identification of TTCAs. First, we constructed 156 different baseline models by combining 12 different feature encoding schemes with 13 popular ML algorithms. Second, these baseline models were trained and employed to create a new probabilistic feature vector. Finally, the optimal probabilistic feature vector was determined based on the feature selection strategy and then used for the construction of our stacked model. Comparative benchmarking experiments indicated that StackTTCA clearly outperformed several ML classifiers and the existing methods on the independent test, with an accuracy of 0.932 and a Matthews correlation coefficient of 0.866.
Conclusions: In summary, the proposed stacking ensemble learning-based framework of StackTTCA could help to precisely and rapidly identify true TTCAs for follow-up experimental verification. In addition, we developed an online web server to maximize user convenience for high-throughput screening of novel TTCAs.
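The stacking idea summarized in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the two encodings, the two base learners, the residue grouping, the logistic meta-learner, and the toy peptides are all assumptions chosen for brevity (the abstract reports 12 encodings and 13 algorithms, and a feature selection step that is omitted here). Each (encoding, learner) pair contributes one out-of-fold score per sequence, and those scores form the probabilistic feature vector on which the meta-learner is trained.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical coarse residue grouping, used only for illustration.
GROUPS = ["AVLIMFWC", "STNQGPYH", "DE", "KR"]

def aac(seq):
    """Amino acid composition: fraction of each of the 20 residues."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def group_comp(seq):
    """Fraction of residues falling in each coarse group."""
    return [sum(seq.count(a) for a in g) / len(seq) for g in GROUPS]

def dist(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid_model(X, y):
    """Nearest-centroid learner returning a probability-like score for class 1."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c1 = mean([x for x, t in zip(X, y) if t == 1])
    c0 = mean([x for x, t in zip(X, y) if t == 0])
    def predict(x):
        d0, d1 = dist(x, c0), dist(x, c1)
        return 0.5 if d0 + d1 == 0 else d0 / (d0 + d1)
    return predict

def knn_model(X, y, k=3):
    """k-nearest-neighbour learner returning the fraction of positive neighbours."""
    def predict(x):
        nearest = sorted(zip(X, y), key=lambda p: dist(x, p[0]))[:k]
        return sum(t for _, t in nearest) / len(nearest)
    return predict

ENCODINGS = [aac, group_comp]            # stand-ins for the 12 encoding schemes
LEARNERS = [centroid_model, knn_model]   # stand-ins for the 13 ML algorithms

def out_of_fold_probs(seqs, y, n_folds=3):
    """One out-of-fold score per (encoding, learner) pair for every sequence."""
    n = len(seqs)
    feats = [[] for _ in range(n)]
    for enc in ENCODINGS:
        X = [enc(s) for s in seqs]
        for make in LEARNERS:
            for f in range(n_folds):
                train = [i for i in range(n) if i % n_folds != f]
                model = make([X[i] for i in train], [y[i] for i in train])
                for i in range(f, n, n_folds):   # held-out samples of fold f
                    feats[i].append(model(X[i]))
    return feats

def fit_logistic(F, y, lr=0.5, epochs=500):
    """Meta-learner: logistic regression fitted by stochastic gradient ascent."""
    w = [0.0] * (len(F[0]) + 1)  # last entry is the bias
    def score(f):
        z = w[-1] + sum(wi * fi for wi, fi in zip(w, f))
        return 1 / (1 + math.exp(-z))
    for _ in range(epochs):
        for f, t in zip(F, y):
            err = t - score(f)
            for j, fj in enumerate(f):
                w[j] += lr * err * fj
            w[-1] += lr * err
    return score

def mcc(y_true, y_pred):
    """Matthews correlation coefficient from the binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Synthetic toy peptides (NOT real TTCAs); labels alternate 1, 0, 1, 0, ...
seqs = ["ALLVIFAML", "KRDEKKDER", "LIVFAALLM", "DEKRRKEDD",
        "AVLLIFMVA", "KKRDEEDKR", "FLLAVIMLA", "RDEKDKRKE",
        "LLAVVFIAM", "EKDRKREDK", "IVLLAFMAL", "DKRKEEDRK"]
labels = [1, 0] * 6

F = out_of_fold_probs(seqs, labels)
meta = fit_logistic(F, labels)
pred = [1 if meta(f) >= 0.5 else 0 for f in F]
acc = sum(p == t for p, t in zip(pred, labels)) / len(labels)
print("accuracy:", acc, "MCC:", mcc(labels, pred))
```

Using out-of-fold scores rather than scores from models fitted on the full training set keeps the meta-learner from seeing predictions made on samples the base models were trained on. In this sketch the meta-learner is evaluated on its own training data, so the printed numbers only confirm that the pipeline runs; a real evaluation would use a held-out independent test set, as the abstract describes.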
Pages: 16