Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Cited by: 0
Authors
Christoph Hafemeister
Rahul Satija
Affiliations
[1] New York Genome Center
[2] Center for Genomics and Systems Biology
[3] New York University
Source
Keywords
Single-cell RNA-seq; Normalization
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments. We propose that the Pearson residuals from “regularized negative binomial regression,” where cellular sequencing depth is utilized as a covariate in a generalized linear model, successfully remove the influence of technical characteristics from downstream analyses while preserving biological heterogeneity. Importantly, we show that an unconstrained negative binomial model may overfit scRNA-seq data, and overcome this by pooling information across genes with similar abundances to obtain stable parameter estimates. Our procedure omits the need for heuristic steps including pseudocount addition or log-transformation and improves common downstream analytical tasks such as variable gene selection, dimensional reduction, and differential expression. Our approach can be applied to any UMI-based scRNA-seq dataset and is freely available as part of the R package sctransform, with a direct interface to our single-cell toolkit Seurat.
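As a rough illustration of the Pearson residuals mentioned in the abstract, the sketch below computes the residual of a single UMI count under a negative binomial model with a log link and log10 sequencing depth as the only covariate. The function name and the parameter values (beta0, beta1, theta) are hypothetical and chosen for illustration; the regularization step that pools information across genes is not shown, and this is not the sctransform implementation.

```python
import math


def nb_pearson_residual(count, depth, beta0, beta1, theta):
    """Pearson residual of one UMI count under a negative binomial GLM.

    Assumed model (illustrative): log(mu) = beta0 + beta1 * log10(depth),
    variance = mu + mu^2 / theta.
    """
    # Expected count given the cell's sequencing depth.
    mu = math.exp(beta0 + beta1 * math.log10(depth))
    # Negative binomial variance with inverse-dispersion parameter theta.
    variance = mu + mu ** 2 / theta
    return (count - mu) / math.sqrt(variance)


# Hypothetical per-gene parameters: with beta1 = ln(10), the expected count
# is proportional to depth (here mu = 0.001 * depth).
BETA0 = math.log(0.001)
BETA1 = math.log(10)
THETA = 10.0

# The same observed count of 5 in a shallow and in a deep cell.
shallow = nb_pearson_residual(5, 1000, BETA0, BETA1, THETA)
deep = nb_pearson_residual(5, 5000, BETA0, BETA1, THETA)
print(shallow, deep)
```

With these assumed parameters, a count of 5 is well above expectation in the cell with 1,000 total UMIs but matches expectation in the cell with 5,000, so only the former yields a large residual. This is the sense in which the residuals account for sequencing depth.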
Related papers
50 in total
  • [31] Single-cell RNA-seq data augmentation using generative Fourier transformer
    Nouri, Nima
    COMMUNICATIONS BIOLOGY, 2025, 8 (01)
  • [32] Negative binomial additive model for RNA-Seq data analysis
    Ren, Xu
    Kuan, Pei-Fen
    BMC Bioinformatics, 21
  • [33] Negative binomial additive model for RNA-Seq data analysis
    Ren, Xu
    Kuan, Pei-Fen
    BMC BIOINFORMATICS, 2020, 21 (01)
  • [34] Ensemble Adaptive Total Variation Graph Regularized NMF for Single-cell RNA-Seq Data Analysis
    Zhu, Ya-Li
    Gao, Ying-Lian
    Liu, Jin-Xing
    Zhu, Rong
    Kong, Xiang-Zhen
    CURRENT BIOINFORMATICS, 2021, 16 (08) : 1014 - 1023
  • [35] Quantifying the clusterness and trajectoriness of single-cell RNA-seq data
    Lim, Hong Seo
    Qiu, Peng
    PLOS COMPUTATIONAL BIOLOGY, 2024, 20 (02)
  • [36] Evaluating imputation methods for single-cell RNA-seq data
    Cheng, Yi
    Ma, Xiuli
    Yuan, Lang
    Sun, Zhaoguo
    Wang, Pingzhang
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [37] Challenges in unsupervised clustering of single-cell RNA-seq data
    Kiselev, Vladimir Yu
    Andrews, Tallulah S.
    Hemberg, Martin
    NATURE REVIEWS GENETICS, 2019, 20 (05) : 273 - 282
  • [38] Testing for Phylogenetic Signal in Single-Cell RNA-Seq Data
    Moravec, Jiri C.
    Lanfear, Robert
    Spector, David L.
    Diermeier, Sarah D.
    Gavryushkin, Alex
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2023, 30 (04) : 518 - 537
  • [39] Locality Sensitive Imputation for Single-Cell RNA-Seq Data
    Moussa, Marmar
    Mandoiu, Ion I.
    BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2018, 2018, 10847 : 347 - 360
  • [40] Supervised Adversarial Alignment of Single-Cell RNA-seq Data
    Ge, Songwei
    Wang, Haohan
    Alavi, Amir
    Xing, Eric
    Bar-Joseph, Ziv
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2021, 28 (05) : 501 - 513