Single-Cell RNA-Seq Debiased Clustering via Batch Effect Disentanglement

Cited by: 3
Authors
Li, Yunfan [1]
Lin, Yijie [1]
Hu, Peng [1]
Peng, Dezhong [1]
Luo, Han [2]
Peng, Xi [1]
Affiliations
[1] Sichuan Univ, Sch Comp Sci, Chengdu 610000, Peoples R China
[2] Sichuan Univ, West China Hosp, Chengdu 610000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Biological information theory; Clustering methods; Data models; Feature extraction; Deep learning; Data mining; Task analysis; Batch integration; clustering; single-cell RNA analysis; EXPRESSION;
DOI
10.1109/TNNLS.2023.3260003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A variety of single-cell RNA-seq (scRNA-seq) clustering methods have achieved great success in discovering cellular phenotypes. However, clustering remains challenging when the data are confounded by batch effects arising from different experimental conditions or technologies: the resulting data partitions become biased toward these nonbiological factors. Moreover, batch differences are not always much smaller than true biological variation, which hinders combining batch integration with clustering methods. To overcome this challenge, we propose single-cell RNA-seq debiased clustering (SCDC), an end-to-end clustering method that is debiased against batch effects by disentangling the biological and nonbiological information in scRNA-seq data during data partitioning. In six analyses, SCDC qualitatively and quantitatively outperforms state-of-the-art clustering and batch integration methods in handling scRNA-seq data with batch effects. Furthermore, SCDC's running time increases linearly with the number of cells while its graphics processing unit (GPU) memory consumption stays fixed, making it scalable to large datasets. The code will be released on GitHub.
Pages: 11371-11381
Number of pages: 11