Generative pretraining from large-scale transcriptomes for single-cell deciphering

Cited by: 9
Authors
Shen, Hongru [1]
Liu, Jilei [1]
Hu, Jiani [1]
Shen, Xilin [1]
Zhang, Chao [2]
Wu, Dan [1]
Feng, Mengyao [1]
Yang, Meng [1]
Li, Yang [1]
Yang, Yichen [1]
Wang, Wei [3]
Zhang, Qiang [4]
Yang, Jilong [2]
Chen, Kexin [3]
Li, Xiangchun [1]
Affiliations
[1] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Tianjin Canc Inst, Tianjins Clin Res Ctr Canc,Natl Clin Res Ctr Canc, Tianjin, Peoples R China
[2] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Dept Bone & Soft Tissue Tumor, Tianjins Clin Res Ctr Canc,Natl Clin Res Ctr Canc, Tianjin, Peoples R China
[3] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Dept Epidemiol & Biostat, Natl Clin Res Ctr Canc,Key Lab Mol Canc Epidemiol, Tianjin, Peoples R China
[4] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Tianjins Clin Res Ctr Canc, Dept Maxillofacial & Otorhinolaryngol Oncol,Natl C, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
EXPRESSION; TISSUES;
DOI
10.1016/j.isci.2023.106536
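Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract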
Exponential accumulation of single-cell transcriptomes poses a great challenge for efficient assimilation. Here, we present an approach entitled generative pretraining from transcriptomes (tGPT) for learning feature representations of transcriptomes. tGPT is conceptually simple in that it autoregressively models the ranking of a gene in the context of its preceding neighbors. We developed tGPT with 22.3 million single-cell transcriptomes and used four single-cell datasets to evaluate its performance on single-cell analysis tasks. In addition, we examined its applications on bulk tissues. The single-cell clusters and cell lineage trajectories derived from tGPT are highly aligned with known cell labels and states. The feature patterns of tumor bulk tissues learned by tGPT are associated with a wide range of genomic alteration events, prognosis, and treatment outcome of immunotherapy. tGPT represents a new analytical paradigm for integrating and deciphering massive amounts of transcriptome data, and it will facilitate the interpretation and clinical translation of single-cell transcriptomes.
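The abstract's central idea, modeling the ranking of a gene given the genes that precede it, can be illustrated with a small sketch. The abstract does not specify the preprocessing, so the details here are assumptions: genes are ordered by descending expression, unexpressed genes are dropped, ties are broken alphabetically, and the hypothetical helpers `rank_genes` and `next_gene_pairs` are not taken from the tGPT code. The sketch only builds the (context, target) pairs an autoregressive model would be trained on; it does not implement the model itself.

```python
from typing import Dict, List, Optional, Tuple


def rank_genes(expression: Dict[str, float], top_k: Optional[int] = None) -> List[str]:
    """Order gene names by descending expression (assumed ranking scheme).

    Genes with zero expression are dropped; ties are broken by gene name
    so the output is deterministic. `top_k` optionally truncates the list.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    expressed.sort(key=lambda item: (-item[1], item[0]))
    ranked = [gene for gene, _ in expressed]
    return ranked if top_k is None else ranked[:top_k]


def next_gene_pairs(ranked: List[str]) -> List[Tuple[List[str], str]]:
    """Build (preceding genes, next gene) training pairs from a ranked list."""
    return [(ranked[:i], ranked[i]) for i in range(1, len(ranked))]


# Toy cell with made-up expression values (illustrative only).
cell = {"CD3E": 5.0, "GAPDH": 12.0, "MS4A1": 0.0, "ACTB": 9.0}
ranked = rank_genes(cell)
pairs = next_gene_pairs(ranked)
for context, target in pairs:
    print(context, "->", target)
```

Under these assumptions, each cell becomes a sequence of gene tokens, and a model trained on such pairs learns the probability of the next-ranked gene given its higher-ranked predecessors.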
Pages: 20