bayNorm: Bayesian gene expression recovery, imputation and normalization for single-cell RNA-sequencing data

Cited by: 60
Authors
Tang, Wenhao [1]
Bertaux, Francois [1, 2, 3, 4]
Thomas, Philipp [1]
Stefanelli, Claire [1]
Saint, Malika [2, 3]
Marguerat, Samuel [2, 3]
Shahrezaei, Vahid [1]
Affiliations
[1] Imperial Coll, Fac Nat Sci, Dept Math, London SW7 2AZ, England
[2] MRC London Inst Med Sci LMS, London W12 0NN, England
[3] Imperial Coll London, Fac Med, Inst Clin Sci, London W12 0NN, England
[4] Inst Pasteur, USR 3756, IP, CNRS, 28 Rue Docteur Roux, F-75015 Paris, France
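Funding
UK Medical Research Council (MRC); UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
BIOLOGY
DOI
10.1093/bioinformatics/btz726
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract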
Motivation: Normalization of single-cell RNA-sequencing (scRNA-seq) data is a prerequisite to their interpretation. The marked technical variability, high amounts of missing observations and batch effect typical of scRNA-seq datasets make this task particularly challenging. There is a need for an efficient and unified approach for normalization, imputation and batch effect correction.
Results: Here, we introduce bayNorm, a novel Bayesian approach for scaling and inference of scRNA-seq counts. The method's likelihood function follows a binomial model of mRNA capture, while priors are estimated from expression values across cells using an empirical Bayes approach. We first validate our assumptions by showing this model can reproduce different statistics observed in real scRNA-seq data. We demonstrate using publicly available scRNA-seq datasets and simulated expression data that bayNorm allows robust imputation of missing values generating realistic transcript distributions that match single molecule fluorescence in situ hybridization measurements. Moreover, by using priors informed by dataset structures, bayNorm improves accuracy and sensitivity of differential expression analysis and reduces batch effect compared with other existing methods. Altogether, bayNorm provides an efficient, integrated solution for global scaling normalization, imputation and true count recovery of gene expression measurements from scRNA-seq data.
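The binomial capture likelihood described in the abstract can be illustrated with a small sketch. It is not the bayNorm package API: the function names are hypothetical, the negative binomial prior is an assumption (the abstract does not name the prior family), and the capture efficiency `beta` and prior parameters `mu` and `size` are taken as already estimated. The sketch computes the posterior over the true transcript count `n` given an observed count `x`.

```python
import math


def nb_log_pmf(n, mu, size):
    """Log-probability of n under a negative binomial prior with mean mu
    and dispersion parameter size (assumed prior family)."""
    p = size / (size + mu)
    return (math.lgamma(n + size) - math.lgamma(size) - math.lgamma(n + 1)
            + size * math.log(p) + n * math.log1p(-p))


def binom_log_pmf(x, n, beta):
    """Log-probability of capturing x of n transcripts, each with probability beta."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(beta) + (n - x) * math.log1p(-beta))


def posterior_true_count(x, beta, mu, size, n_max=1000):
    """Return (support, probabilities) for P(n | x) on n = x..n_max.

    n_max truncates the support; it is an illustrative choice and should be
    large relative to the plausible true counts.
    """
    support = list(range(x, n_max + 1))
    log_w = [binom_log_pmf(x, n, beta) + nb_log_pmf(n, mu, size) for n in support]
    shift = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return support, [v / total for v in w]


def posterior_mean(x, beta, mu, size, n_max=1000):
    """Posterior expected true count given an observed count x."""
    support, probs = posterior_true_count(x, beta, mu, size, n_max)
    return sum(n * p for n, p in zip(support, probs))
```

Under these assumptions, an observed zero still yields a nonzero posterior mean (for example `posterior_mean(0, 0.1, 10.0, 2.0)`), which is how a model of this kind can impute missing observations rather than treating them as true zeros.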
Pages: 1174-1181
Number of pages: 8