Statistics or biology: the zero-inflation controversy about scRNA-seq data

Cited by: 107
Authors
Jiang, Ruochen [1]
Sun, Tianyi [1]
Song, Dongyuan [2]
Li, Jingyi Jessica [1,3,4,5]
Affiliations
[1] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Bioinformat Interdept PhD Program, Los Angeles, CA 90095 USA
[3] Univ Calif Los Angeles, Dept Human Genet, Los Angeles, CA 90095 USA
[4] Univ Calif Los Angeles, Dept Computat Med, Los Angeles, CA 90095 USA
[5] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Funding
National Science Foundation (USA);
Keywords
CELL GENE-EXPRESSION; SINGLE-CELL; RNA-SEQ; FATE DECISIONS; DNA; RECONSTRUCTION; AMPLIFICATION; IMPUTATION; BINDING; MODEL;
DOI
10.1186/s13059-022-02601-5
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
Pages: 24
Related papers
50 in total
  • [31] FRMC: a fast and robust method for the imputation of scRNA-seq data
    Wu, Honglong
    Wang, Xuebin
    Chu, Mengtian
    Xiang, Ruizhi
    Zhou, Ke
    RNA BIOLOGY, 2021, 18 : 172 - 181
  • [32] Deep embedded clustering with multiple objectives on scRNA-seq data
    Li, Xiangtao
    Zhang, Shixiong
    Wong, Ka-Chun
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (05)
  • [33] Detection of differentially abundant cell subpopulations in scRNA-seq data
    Zhao, Jun
    Jaffe, Ariel
    Li, Henry
    Lindenbaum, Ofir
    Sefik, Esen
    Jackson, Ruaidhri
    Cheng, Xiuyuan
    Flavell, Richard A.
    Kluger, Yuval
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2021, 118 (22)
  • [34] CellDepot: A Unified Repository for scRNA-seq Data and Visual Exploration
    Lin, Dongdong
    Chen, Yirui
    Negi, Soumya
    Cheng, Derrick
    Ouyang, Zhengyu
    Sexton, David
    Li, Kejie
    Zhang, Baohong
    JOURNAL OF MOLECULAR BIOLOGY, 2022, 434 (11)
  • [35] miRSCAPE - inferring miRNA expression from scRNA-seq data
    Olgun, Gulden
    Gopalan, Vishaka
    Hannenhalli, Sridhar
    ISCIENCE, 2022, 25 (09)
  • [36] scRNA-seq data analysis method to improve analysis performance
    Lu, Junru
    Sheng, Yuqi
    Qian, Weiheng
    Pan, Min
    Zhao, Xiangwei
    Ge, Qinyu
    IET NANOBIOTECHNOLOGY, 2023, 17 (03) : 246 - 256
  • [37] Deep zero-inflated negative binomial model and its application in scRNA-seq data integration
    Wei, Mingqiu
    Liu, Rongjie
    Wang, Yue Julia
    Huang, Chao
    SOUTHEASTCON 2023, 2023, : 901 - 905
  • [38] Interpretable Factors in scRNA-seq Data with Disentangled Generative Models
    Mao, Haiyi
    Broerman, Matthew J.
    Benos, Panayiotis V.
    2020 IEEE 20TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2020), 2020, : 85 - 88
  • [39] Exploring Hierarchical Structures of Cell Types in scRNA-seq Data
    Zhai, Haojie
    Ye, Yusen
    Hu, Yuxuan
    Wang, Lanying
    Gao, Lin
    BIOINFORMATICS RESEARCH AND APPLICATIONS, PT II, ISBRA 2024, 2024, 14955 : 1 - 13
  • [40] A Note on Tests for Zero-Inflation in Correlated Count Data
    Xiang, Liming
    Teo, Guo Shou
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2011, 40 (07) : 992 - 1005