Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Authors
Christoph Hafemeister
Rahul Satija
Affiliations
[1] New York Genome Center
[2] Center for Genomics and Systems Biology
[3] New York University
Keywords
Single-cell RNA-seq; Normalization
DOI
Not available
Abstract
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
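To make the idea of Pearson residuals from a negative binomial regression concrete, here is a minimal sketch for a single gene. It is not the sctransform implementation. It assumes a log link with log10 sequencing depth as the only covariate and a variance of mu + mu^2/theta; neither detail is stated in the abstract. The function name `nb_pearson_residuals` and all parameter values are hypothetical. In the approach described above, the parameters would first be fitted per gene and then stabilized by pooling information across genes with similar abundance, and that step is not shown here.

```python
import math


def nb_pearson_residuals(counts, depths, beta0, beta1, theta):
    """Pearson residuals for one gene under an assumed negative binomial GLM.

    counts: UMI counts of the gene, one per cell
    depths: total UMI count (sequencing depth) of each cell
    beta0, beta1: intercept and slope on log10(depth), assumed already
        fitted and regularized (hypothetical inputs)
    theta: negative binomial dispersion parameter (hypothetical input)
    """
    residuals = []
    for x, depth in zip(counts, depths):
        # Expected count given the cell's sequencing depth (log link).
        mu = math.exp(beta0 + beta1 * math.log10(depth))
        # Negative binomial variance: Poisson term plus overdispersion.
        variance = mu + mu * mu / theta
        residuals.append((x - mu) / math.sqrt(variance))
    return residuals


# Hypothetical counts for one gene in four cells of differing depth.
counts = [0, 2, 5, 1]
depths = [1000, 4000, 10000, 2000]
res = nb_pearson_residuals(counts, depths, beta0=-6.0, beta1=2.0, theta=10.0)
```

A residual near zero means the observed count matches what sequencing depth alone would predict. Large positive or negative values mark cells where the gene departs from that expectation, and this is the variation that downstream analyses are meant to keep.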
Related papers
50 in total
  • [1] Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression
    Hafemeister, Christoph
    Satija, Rahul
    GENOME BIOLOGY, 2019, 20 (01)
  • [2] Resistant Fit Regression Normalization for Single-cell RNA-seq Data
    Kuang, Da
    Kim, Junhyong
    2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020), 2020, : 236 - 240
  • [3] SCnorm: robust normalization of single-cell RNA-seq data
    Bacher, Rhonda
    Chu, Li-Fang
    Leng, Ning
    Gasch, Audrey P.
    Thomson, James A.
    Stewart, Ron M.
    Newton, Michael
    Kendziorski, Christina
    NATURE METHODS, 2017, 14 (06) : 584 - +
  • [4] PsiNorm: a scalable normalization for single-cell RNA-seq data
    Borella, Matteo
    Martello, Graziano
    Risso, Davide
    Romualdi, Chiara
    BIOINFORMATICS, 2022, 38 (01) : 164 - 172
  • [5] SCnorm: robust normalization of single-cell RNA-seq data
    Rhonda Bacher
    Li-Fang Chu
    Ning Leng
    Audrey P Gasch
    James A Thomson
    Ron M Stewart
    Michael Newton
    Christina Kendziorski
    Nature Methods, 2017, 14 : 584 - 586
  • [6] Normalization Methods on Single-Cell RNA-seq Data: An Empirical Survey
    Lytal, Nicholas
    Ran, Di
    An, Lingling
    FRONTIERS IN GENETICS, 2020, 11
  • [7] Clustering Single-Cell RNA-Seq Data with Regularized Gaussian Graphical Model
    Liu, Zhenqiu
    GENES, 2021, 12 (02) : 1 - 12
  • [8] Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data
    Lause, Jan
    Berens, Philipp
    Kobak, Dmitry
    GENOME BIOLOGY, 2021, 22 (01)
  • [9] Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data
    Jan Lause
    Philipp Berens
    Dmitry Kobak
    Genome Biology, 22
  • [10] SCRABBLE: single-cell RNA-seq imputation constrained by bulk RNA-seq data
    Peng, Tao
    Zhu, Qin
    Yin, Penghang
    Tan, Kai
    GENOME BIOLOGY, 2019, 20 (01)