Deconvolution from bulk gene expression by leveraging sample-wise and gene-wise similarities and single-cell RNA-Seq data

Cited by: 1
Authors
Wang, Chenqi [1]
Lin, Yifan [1]
Li, Shuchao [1]
Guan, Jinting [1, 2, 3]
Affiliations
[1] Xiamen Univ, Dept Automat, Xiamen, Peoples R China
[2] Minist Educ, Key Lab Syst Control & Informat Proc, Shanghai, Peoples R China
[3] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Xiamen, Peoples R China
Source
BMC GENOMICS | 2024 / Vol. 25 / Issue 01
Keywords
Deconvolution; Cell type abundance; Cell type-specific gene expression profile; Similarity matrix; Single-cell RNA-seq data; MOUSE; MAP; NORMALIZATION; HETEROGENEITY; DIVERSITY; ATLAS; STEM;
DOI
10.1186/s12864-024-10728-x
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Background: The widely adopted bulk RNA-seq measures the average gene expression across cells, masking cell type heterogeneity and confounding downstream analyses. Identifying the cellular composition and cell type-specific gene expression profiles (GEPs) therefore facilitates the study of the mechanisms underlying various biological processes. Although single-cell RNA-seq captures cell type heterogeneity in gene expression, it requires specialized and expensive resources and is currently not practical for large numbers of samples or routine clinical settings. Computational deconvolution methods have recently been developed, but many of them estimate only cell type composition or only cell type-specific GEPs, requiring the other as input. More accurate deconvolution methods that infer both cell type abundance and cell type-specific GEPs are still needed.
Results: We propose a new deconvolution algorithm, DSSC, which simultaneously infers cell type-specific gene expression and cell type proportions of heterogeneous samples by leveraging gene-gene and sample-sample similarities in bulk expression and single-cell RNA-seq data. Through comparisons with existing methods, we demonstrate that DSSC is effective in inferring both cell type proportions and cell type-specific GEPs across simulated pseudo-bulk data (including intra-dataset and inter-dataset simulations) and experimental bulk data (including mixture data and real experimental data). DSSC is robust to changes in the number of marker genes and in sample size, and it is efficient in both cost and time.
Conclusions: DSSC provides a practical and promising alternative to experimental techniques for characterizing the cellular composition and gene expression heterogeneity of heterogeneous samples.
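The deconvolution problem described in the abstract can be illustrated with a generic linear mixing model, in which each bulk sample is a proportion-weighted sum of cell type-specific GEPs. The sketch below is not the DSSC algorithm: it assumes the GEPs are already known and recovers only the proportions by ordinary least squares, whereas DSSC estimates both quantities jointly and uses similarity matrices. The function name `estimate_proportions` and all toy values are hypothetical and chosen purely for illustration.

```python
import numpy as np


def estimate_proportions(bulk, signatures):
    """Least-squares estimate of cell type proportions per bulk sample.

    bulk:       genes x samples matrix of bulk expression
    signatures: genes x cell_types matrix of cell type-specific GEPs
    """
    # Solve signatures @ P ~= bulk for P (cell_types x samples).
    proportions, *_ = np.linalg.lstsq(signatures, bulk, rcond=None)
    # Proportions cannot be negative: clip, then rescale each sample to sum to 1.
    proportions = np.clip(proportions, 0.0, None)
    return proportions / proportions.sum(axis=0, keepdims=True)


# Toy data (made up): 4 genes, 2 cell types, 2 samples.
signatures = np.array([[10.0, 1.0],
                       [8.0, 2.0],
                       [1.0, 9.0],
                       [2.0, 7.0]])
true_proportions = np.array([[0.7, 0.2],
                             [0.3, 0.8]])
bulk = signatures @ true_proportions  # noise-free mixture of the two cell types

estimated = estimate_proportions(bulk, signatures)
print(np.round(estimated, 3))
```

On noise-free toy data the least-squares solution matches the mixing proportions up to floating-point error. Real bulk data are noisy and the GEPs are themselves uncertain, which is the setting the abstract says methods such as DSSC are designed for.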
Pages: 19
Related papers
50 in total
  • [1] Bulk Tissue Gene Expression Deconvolution Using Single Cell RNA-Seq Data
    Wang, X.
    Li, M.
    Zhang, N.
    HUMAN HEREDITY, 2017, 83 (01) : 51 - 51
  • [2] scINRB: single-cell gene expression imputation with network regularization and bulk RNA-seq data
    Kang, Yue
    Zhang, Hongyu
    Guan, Jinting
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (03)
  • [3] A combined approach with gene-wise normalization improves the analysis of RNA-seq data in human breast cancer subtypes
    Li, Xiaohong
    Rouchka, Eric C.
    Brock, Guy N.
    Yan, Jun
    O'Toole, Timothy E.
    Tieri, David A.
    Cooper, Nigel G. F.
    PLOS ONE, 2018, 13 (08):
  • [4] Bayesian inference of gene expression states from single-cell RNA-seq data
    Breda, Jeremie
    Zavolan, Mihaela
    van Nimwegen, Erik
    NATURE BIOTECHNOLOGY, 2021, 39 (08) : 1008 - +
  • [5] Bayesian inference of gene expression states from single-cell RNA-seq data
    Jérémie Breda
    Mihaela Zavolan
    Erik van Nimwegen
    Nature Biotechnology, 2021, 39 : 1008 - 1016
  • [6] Improved moderation for gene-wise variance estimation in RNA-Seq via the exploitation of external information
    Patrick, Ellis
    Buckley, Michael
    Lin, David Ming
    Yang, Yee Hwa
    BMC GENOMICS, 2013, 14
  • [7] Improved moderation for gene-wise variance estimation in RNA-Seq via the exploitation of external information
    Ellis Patrick
    Michael Buckley
    David Ming Lin
    Yee Hwa Yang
    BMC Genomics, 14 (Suppl 1)
  • [8] Sample-Wise and Gene-Wise Comparisons Confirm a Greater Similarity of RNA and Protein Expression Data at the Level of Molecular Pathways and Suggest an Approach for the Data Quality Check in High-Throughput Expression Databases
    Raevskiy, Mikhail
    Sorokin, Maxim
    Emelianova, Aleksandra
    Zakharova, Galina
    Poddubskaya, Elena
    Zolotovskaia, Marianna
    Buzdin, Anton
    BIOCHEMISTRY-MOSCOW, 2024, 89 (04) : 737 - 746
  • [9] A probabilistic gene expression barcode for annotation of cell types from single-cell RNA-seq data
    Grabski, Isabella N.
    Irizarry, Rafael A.
    BIOSTATISTICS, 2022, 23 (04) : 1150 - 1164
  • [10] scENT for Revealing Gene Clusters From Single-Cell RNA-Seq Data
    Rao, Fan
    Chen, Minghan
    Yang, Defu
    Morrell, Bess
    Song, Qianqian
    Zhu, Wentao
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (03) : 2266 - 2277