eSVD-DE: cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings

Authors
Lin, Kevin Z. [1]
Qiu, Yixuan [2]
Roeder, Kathryn [3]
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
[2] Shanghai Univ Finance & Econ, Sch Stat & Management, Shanghai, Peoples R China
[3] Carnegie Mellon Univ, Dept Stat & Data Sci, Pittsburgh, PA USA
Keywords
Case-control subjects; Gamma-Poisson distribution; Matrix factorization; Multi-individual data; Normalization; Genomics
DOI
10.1186/s12859-024-05724-7
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Single-cell RNA-sequencing (scRNA) datasets are becoming increasingly popular in clinical and cohort studies, but there is a lack of methods to investigate differentially expressed (DE) genes among such datasets with numerous individuals. While numerous methods exist to find DE genes for scRNA data from limited individuals, differential-expression testing for large cohorts of case and control individuals using scRNA data poses unique challenges due to substantial effects of human variation, i.e., individual-level confounding covariates that are difficult to account for in the presence of sparsely-observed genes.
Results: We develop eSVD-DE, a matrix factorization that pools information across genes and removes confounding covariate effects, followed by a novel two-sample test in mean expression between case and control individuals. In general, differential testing after dimension reduction yields an inflation of Type-1 errors. However, we overcome this by testing for differences between the case and control individuals' posterior mean distributions via a hierarchical model. In previously published datasets of various biological systems, eSVD-DE has more accuracy and power compared to other DE methods typically repurposed for analyzing cohort-wide differential expression.
Conclusions: eSVD-DE proposes a novel and powerful way to test for DE genes among cohorts after performing a dimension reduction. Accurate identification of differential expression on the individual level, instead of the cell level, is important for linking scRNA-seq studies to our understanding of the human population.
Pages: 30
Related papers
50 in total
  • [1] eSVD-DE: cohort-wide differential expression in single-cell RNA-seq data using exponential-family embeddings
    Kevin Z. Lin
    Yixuan Qiu
    Kathryn Roeder
    BMC Bioinformatics, 25
  • [2] Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data
    Lin, Kevin Z.
    Lei, Jing
    Roeder, Kathryn
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (534) : 457 - 470
  • [3] Discussion of "Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data"
    Hu, Jian
    Li, Mingyao
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2021, 116 (534) : 475 - 477
  • [4] Differential expression of single-cell RNA-seq data using Tweedie models
    Mallick, Himel
    Chatterjee, Suvo
    Chowdhury, Shrabanti
    Chatterjee, Saptarshi
    Rahnavard, Ali
    Hicks, Stephanie C.
    STATISTICS IN MEDICINE, 2022, 41 (18) : 3492 - 3510
  • [5] DEsingle for detecting three types of differential expression in single-cell RNA-seq data
    Miao, Zhun
    Deng, Ke
    Wang, Xiaowo
    Zhang, Xuegong
    BIOINFORMATICS, 2018, 34 (18) : 3223 - 3224
  • [6] Differential expression analyses for single-cell RNA-Seq: old questions on new data
    Zhun Miao
    Xuegong Zhang
    Quantitative Biology, 2016, 4 (04) : 243 - 260+336
  • [7] IDEAS: individual level differential expression analysis for single-cell RNA-seq data
    Zhang, Mengqi
    Liu, Si
    Miao, Zhen
    Han, Fang
    Gottardo, Raphael
    Sun, Wei
    GENOME BIOLOGY, 2022, 23 (01)
  • [8] IDEAS: individual level differential expression analysis for single-cell RNA-seq data
    Mengqi Zhang
    Si Liu
    Zhen Miao
    Fang Han
    Raphael Gottardo
    Wei Sun
    Genome Biology, 23
  • [9] DECENT: differential expression with capture efficiency adjustmeNT for single-cell RNA-seq data
    Ye, Chengzhong
    Speed, Terence P.
    Salim, Agus
    BIOINFORMATICS, 2019, 35 (24) : 5155 - 5162
  • [10] ZIAQ: a quantile regression method for differential expression analysis of single-cell RNA-seq data
    Zhang, Wenfei
    Wei, Ying
    Zhang, Donghui
    Xu, Ethan Y.
    BIOINFORMATICS, 2020, 36 (10) : 3124 - 3130