Generative pretraining from large-scale transcriptomes for single-cell deciphering

Cited by: 16
Authors
Shen, Hongru [1]
Liu, Jilei [1]
Hu, Jiani [1]
Shen, Xilin [1]
Zhang, Chao [2]
Wu, Dan [1]
Feng, Mengyao [1]
Yang, Meng [1]
Li, Yang [1]
Yang, Yichen [1]
Wang, Wei [3]
Zhang, Qiang [4]
Yang, Jilong [2]
Chen, Kexin [3]
Li, Xiangchun [1]
Affiliations
[1] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Tianjin Canc Inst, Tianjins Clin Res Ctr Canc,Natl Clin Res Ctr Canc, Tianjin, Peoples R China
[2] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Dept Bone & Soft Tissue Tumor, Tianjins Clin Res Ctr Canc,Natl Clin Res Ctr Canc, Tianjin, Peoples R China
[3] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Dept Epidemiol & Biostat, Natl Clin Res Ctr Canc,Key Lab Mol Canc Epidemiol, Tianjin, Peoples R China
[4] Tianjin Med Univ, Tianjin Med Univ Canc Inst & Hosp, Tianjins Clin Res Ctr Canc, Dept Maxillofacial & Otorhinolaryngol Oncol,Natl C, Tianjin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
EXPRESSION; TISSUES;
DOI
10.1016/j.isci.2023.106536
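Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes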
07 ; 0710 ; 09 ;
Abstract
The exponential accumulation of single-cell transcriptomes poses a great challenge for efficient assimilation. Here, we present an approach entitled generative pretraining from transcriptomes (tGPT) for learning feature representations of transcriptomes. tGPT is conceptually simple in that it autoregressively models the ranking of a gene in the context of its preceding neighbors. We developed tGPT with 22.3 million single-cell transcriptomes and used four single-cell datasets to evaluate its performance on single-cell analysis tasks. In addition, we examine its applications on bulk tissues. The single-cell clusters and cell lineage trajectories derived from tGPT are highly aligned with known cell labels and states. The feature patterns of tumor bulk tissues learned by tGPT are associated with a wide range of genomic alteration events, prognosis, and treatment outcome of immunotherapy. tGPT represents a new analytical paradigm for integrating and deciphering massive amounts of transcriptome data, and it will facilitate the interpretation and clinical translation of single-cell transcriptomes.
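The idea of modeling a gene's rank from its preceding neighbors can be illustrated with a small sketch. It is not the authors' implementation. It assumes that a cell is turned into a sequence by sorting its expressed genes in descending order of expression, and that training examples are (preceding genes, next gene) pairs. The function names, the `top_k` cutoff, the tie-breaking rule, and the toy expression values are all hypothetical.

```python
from typing import Dict, List, Tuple


def rank_genes(expression: Dict[str, float], top_k: int = 4) -> List[str]:
    """Order expressed genes by descending expression (assumed ranking rule).

    Genes with zero expression are dropped; ties are broken by gene name.
    `top_k` is a hypothetical cutoff on sequence length.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    expressed.sort(key=lambda gv: (-gv[1], gv[0]))
    return [gene for gene, _ in expressed[:top_k]]


def next_gene_pairs(ranked: List[str]) -> List[Tuple[List[str], str]]:
    """Build (context, target) pairs for autoregressive training.

    Each target gene is predicted from all genes ranked before it.
    """
    return [(ranked[:i], ranked[i]) for i in range(1, len(ranked))]


# Toy cell with made-up expression values.
cell = {"CD3E": 5.0, "ACTB": 9.0, "GAPDH": 7.0, "MS4A1": 0.0, "CD8A": 2.0}
ranked = rank_genes(cell)
pairs = next_gene_pairs(ranked)

for context, target in pairs:
    print(context, "->", target)
```

In a setup like this, an autoregressive model would be trained to assign high probability to each target gene given its context, so only the order of genes is used rather than their absolute expression values.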
Pages: 20