Clustering single-cell RNA-seq data with a model-based deep learning approach

Cited by: 193
Authors
Tian, Tian [1]
Wan, Ji [2]
Song, Qi [2]
Wei, Zhi [1]
Affiliations
[1] New Jersey Inst Technol, Dept Comp Sci, Newark, NJ 07102 USA
[2] CuraCloud Corp, Seattle, WA USA
Keywords
Clustering algorithms - Cells - Cytology - Deep learning;
DOI
10.1038/s42256-019-0037-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Single-cell RNA sequencing (scRNA-seq) promises to provide higher resolution of cellular differences than bulk RNA sequencing. Clustering transcriptomes profiled by scRNA-seq has been routinely conducted to reveal cell heterogeneity and diversity. However, clustering analysis of scRNA-seq data remains a statistical and computational challenge, due to the pervasive dropout events obscuring the data matrix with prevailing 'false' zero count observations. Here, we have developed scDeepCluster, a single-cell model-based deep embedded clustering method, which simultaneously learns feature representation and clustering via explicit modelling of scRNA-seq data generation. Based on testing extensive simulated data and real datasets from four representative single-cell sequencing platforms, scDeepCluster outperformed state-of-the-art methods under various clustering performance metrics and exhibited improved scalability, with running time increasing linearly with sample size. Its accuracy and efficiency make scDeepCluster a promising algorithm for clustering large-scale scRNA-seq data.

Clustering groups of cells in single-cell RNA sequencing datasets can produce high-resolution information for complex biological questions. However, it is statistically and computationally challenging due to the low RNA capture rate, which results in a high number of false zero count observations. A deep learning approach called scDeepCluster, which efficiently combines a model for explicitly characterizing missing values with clustering, shows high performance and improved scalability with a computing time increasing linearly with sample size.
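
The abstract refers to explicitly modelling how dropout produces 'false' zero counts, but it does not name the count model. As an illustration only, the sketch below assumes a zero-inflated negative binomial (ZINB) distribution, a common choice for scRNA-seq counts. The function names and the parameters `mu` (mean), `theta` (dispersion) and `pi` (dropout probability) are hypothetical and are not taken from this record.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.

    pi (0 <= pi < 1) is the assumed probability that a count is a
    dropout ('false') zero; otherwise the count follows the NB above.
    """
    nb = nb_log_pmf(x, mu, theta)
    if x == 0:
        # A zero is either a dropout or a genuine NB zero.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    # A positive count can only come from the NB component.
    return math.log(1.0 - pi) + nb


if __name__ == "__main__":
    # Zero inflation raises the probability of observing a zero.
    print(math.exp(nb_log_pmf(0, mu=2.0, theta=1.5)))
    print(math.exp(zinb_log_pmf(0, mu=2.0, theta=1.5, pi=0.3)))
```

In a model-based clustering setup, the negative of this log-likelihood, summed over cells and genes, could serve as a reconstruction loss alongside a clustering objective; that pairing is stated here as an assumption rather than as the authors' implementation.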
Pages: 191 - 198
Page count: 8
Related papers
50 in total
  • [1] Clustering single-cell RNA-seq data with a model-based deep learning approach
    Tian Tian
    Ji Wan
    Qi Song
    Zhi Wei
    [J]. Nature Machine Intelligence, 2019, 1 : 191 - 198
  • [2] Deep Learning for Clustering Single-cell RNA-seq Data
    Zhu, Yuan
    Bai, Litai
    Ning, Zilin
    Fu, Wenfei
    Liu, Jie
    Jiang, Linfeng
    Fei, Shihuang
    Gong, Shiyun
    Lu, Lulu
    Deng, Minghua
    Yi, Ming
    [J]. CURRENT BIOINFORMATICS, 2024, 19 (03) : 193 - 210
  • [3] An active learning approach for clustering single-cell RNA-seq data
    Lin, Xiang
    Liu, Haoran
    Wei, Zhi
    Roy, Senjuti Basu
    Gao, Nan
    [J]. LABORATORY INVESTIGATION, 2022, 102 (03) : 227 - 235
  • [4] A deep matrix factorization based approach for single-cell RNA-seq data clustering
    Liang, Zhenlan
    Zheng, Ruiqing
    Chen, Siqi
    Yan, Xuhua
    Li, Min
    [J]. METHODS, 2022, 205 : 114 - 122
  • [5] Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data
    Tian, Tian
    Zhang, Jie
    Lin, Xiang
    Wei, Zhi
    Hakonarson, Hakon
    [J]. NATURE COMMUNICATIONS, 2021, 12 (01)
  • [6] Model-based deep embedding for constrained clustering analysis of single cell RNA-seq data
    Tian Tian
    Jie Zhang
    Xiang Lin
    Zhi Wei
    Hakon Hakonarson
    [J]. Nature Communications, 12
  • [7] Deep single-cell RNA-seq data clustering with graph prototypical contrastive learning
    Lee, Junseok
    Kim, Sungwon
    Hyun, Dongmin
    Lee, Namkyeong
    Kim, Yejin
    Park, Chanyoung
    [J]. BIOINFORMATICS, 2023, 39 (06)
  • [8] Single-cell RNA-seq data clustering by deep information fusion
    Ren, Liangrui
    Wang, Jun
    Li, Wei
    Guo, Maozu
    Yu, Guoxian
    [J]. BRIEFINGS IN FUNCTIONAL GENOMICS, 2024, 23 (02) : 128 - 137
  • [9] Model-based clustering for RNA-seq data
    Si, Yaqing
    Liu, Peng
    Li, Pinghua
    Brutnell, Thomas P.
    [J]. BIOINFORMATICS, 2014, 30 (02) : 197 - 205
  • [10] A Global Similarity Learning for Clustering of Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Guo, Lilu
    Xu, Yunpei
    Li, Hong-Dong
    Liao, Xingyu
    Wu, Fang-Xiang
    Peng, Xiaoqing
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019 : 261 - 266