Distribution-Agnostic Deep Learning Enables Accurate Single-Cell Data Recovery and Transcriptional Regulation Interpretation

Cited by: 3
Authors
Su, Yanchi [1]
Yu, Zhuohan [1]
Yang, Yuning [2]
Wong, Ka-Chun [3]
Li, Xiangtao [1]
Affiliations
[1] Jilin Univ, Sch Artificial Intelligence, Changchun 130012, Peoples R China
[2] Univ Toronto, Donnelly Ctr Cellular & Biomol Res, Toronto, ON M5S 3E1, Canada
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong 999077, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
imputation; optimal transport; single-cell RNA sequencing; DIFFERENTIATION; EXPRESSION; INDUCTION; DIVERSITY; DISTINCT; IMMUNE; OXYGEN; HEART; ATLAS; FATE;
DOI
10.1002/advs.202307280
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
Single-cell RNA sequencing (scRNA-seq) is a robust method for studying gene expression at the single-cell level, but accurate quantification of genetic material is often hindered by limited mRNA capture, which leaves many expression values missing. Existing imputation methods rely on strict assumptions about the data, which limits their broader application, and they lack reliable supervision, which leads to biased signal recovery. To address these challenges, the authors developed Bis, a distribution-agnostic deep learning model for accurately recovering missing single-cell gene expression from multiple platforms. Bis is an optimal transport-based autoencoder that captures the intricate distribution of scRNA-seq data while addressing its characteristic sparsity by regularizing the cellular embedding space. The authors also propose a module that uses bulk RNA-seq data to guide reconstruction and ensure expression consistency. Experimental results show that Bis outperforms other models on simulated and real datasets and in various downstream analyses, including batch effect removal, clustering, differential expression analysis, and trajectory inference. Moreover, Bis successfully restores gene expression levels in rare cell subsets in a tumor-matched peripheral blood dataset, revealing developmental characteristics of cytokine-induced natural killer cells within a head and neck squamous cell carcinoma microenvironment.

The accurate measurement of genetic material is challenged by limited intracellular mRNA capture, which leads to many missing expression values. A distribution-agnostic deep learning model, informed by external cues from bulk RNA-seq data, is developed to address this issue. The model precisely reconstructs gene expression patterns, offering valuable insights into the developmental maturation mechanisms of cytokine-induced NK cells.
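The abstract describes Bis as an optimal transport-based autoencoder but gives no implementation details. As a generic illustration of what an optimal transport cost between two distributions looks like, the following is a minimal sketch of entropic-regularized optimal transport (Sinkhorn iterations) on toy histograms. It is not the authors' method: the function names, the three-bin histograms, the squared-distance cost, and the values of `eps` and `n_iter` are all assumptions chosen for the example.

```python
import math

def sinkhorn_plan(a, b, cost, eps=1.0, n_iter=500):
    """Approximate an entropic-regularized transport plan between histograms a and b.

    Hypothetical helper for illustration only; not taken from the Bis paper.
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns so the plan's marginals approach a and b
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def transport_cost(plan, cost):
    """Total cost of moving mass according to the plan."""
    return sum(p * c for p_row, c_row in zip(plan, cost) for p, c in zip(p_row, c_row))

# Toy example: two histograms over three expression bins (assumed values).
bins = [0.0, 1.0, 2.0]
cost = [[(x - y) ** 2 for y in bins] for x in bins]  # squared-distance ground cost
a = [0.6, 0.3, 0.1]
b = [0.1, 0.3, 0.6]

plan_ab = sinkhorn_plan(a, b, cost)
plan_aa = sinkhorn_plan(a, a, cost)
cost_ab = transport_cost(plan_ab, cost)
cost_aa = transport_cost(plan_aa, cost)
print("cost(a, b) =", cost_ab)
print("cost(a, a) =", cost_aa)
```

In this sketch the cost between two different histograms comes out larger than the cost between a histogram and itself, which is the property that makes a transport cost usable as a distribution-matching penalty. How Bis actually defines and applies its optimal transport term is not stated in this record.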
Pages: 27