SAFE-clustering: Single-cell Aggregated (from Ensemble) clustering for single-cell RNA-seq data

Cited by: 76
Authors
Yang, Yuchen [1]
Huh, Ruth [2]
Culpepper, Houston W. [1]
Lin, Yuan [3]
Love, Michael I. [1,2]
Li, Yun [1,2]
Affiliations
[1] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[3] Peking Univ, Sch Life Sci, Ctr Bioinformat, Beijing 100871, Peoples R China
Funding
National Institutes of Health (USA)
Keywords
GENE-EXPRESSION; EMBRYOS;
DOI
10.1093/bioinformatics/bty793
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Accurately clustering cell types from a mass of heterogeneous cells is a crucial first step in the analysis of single-cell RNA-seq (scRNA-Seq) data. Although several methods have recently been developed, they exploit different characteristics of the data and yield varying results, both in the number of clusters and in the actual cluster assignments.
Results: Here, we present SAFE-clustering, Single-cell Aggregated (From Ensemble) clustering, a flexible, accurate and robust method for clustering scRNA-Seq data. SAFE-clustering takes as input the results from multiple clustering methods and builds one consensus solution. It currently embeds four state-of-the-art methods (SC3, CIDR, Seurat and t-SNE + k-means) and ensembles the solutions from these four methods using three hypergraph-based partitioning algorithms. Extensive assessment across 12 datasets, with the number of clusters ranging from 3 to 14 and the number of single cells ranging from 49 to 32 695, showcases the advantages of SAFE-clustering in terms of both cluster number (18.2-58.1% reduction in absolute deviation from the truth) and cluster assignment (on average 36.0% improvement, and up to 18.5% over the best of the four methods, measured by the adjusted Rand index). Moreover, SAFE-clustering is computationally efficient enough to accommodate large datasets, taking <10 min to process 28 733 cells.
Availability and implementation: SAFEclustering, including source code and a tutorial, is freely available at https://github.com/yycunc/SAFEclustering.
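The ensemble idea in the abstract can be illustrated with a small, self-contained sketch. It is not the SAFE-clustering implementation: in place of the hypergraph-based partitioning algorithms mentioned above, it uses a simpler co-association majority vote (two cells are merged when more than half of the input partitions place them together), and the function names `consensus_labels` and `adjusted_rand_index`, the `threshold` parameter, and the toy label vectors are all assumptions made for illustration. The adjusted Rand index is computed from its standard pair-counting definition.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    total = comb(len(labels_a), 2)
    if total == 0:
        return 1.0  # fewer than two cells: trivially identical
    # Pairs of cells that share a cluster in both, in a only, in b only.
    sum_pairs = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (sum_pairs - expected) / (max_index - expected)


def consensus_labels(partitions, threshold=0.5):
    """Illustrative consensus: merge cells that co-cluster in more than
    `threshold` of the input partitions (union-find over cell pairs).
    This is a stand-in for, not a copy of, hypergraph partitioning."""
    n = len(partitions[0])
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        agree = sum(p[i] == p[j] for p in partitions) / len(partitions)
        if agree > threshold:
            parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Toy data (hypothetical): nine cells, three true types, four noisy solutions.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2]
partitions = [
    [0, 0, 0, 1, 1, 1, 2, 2, 2],  # matches the truth
    [1, 1, 1, 0, 0, 0, 2, 2, 2],  # same grouping, different label names
    [0, 0, 1, 1, 1, 1, 2, 2, 2],  # one cell misassigned
    [0, 0, 0, 1, 1, 2, 2, 2, 2],  # a different cell misassigned
]
consensus = consensus_labels(partitions)
ari_consensus = adjusted_rand_index(truth, consensus)
ari_individual = [adjusted_rand_index(truth, p) for p in partitions]
```

In this toy case the errors of the individual solutions fall on different cells, so the majority vote recovers the true grouping. Because merging is transitive, this simple scheme can chain unrelated clusters together on noisier data, which is one reason dedicated partitioning algorithms are used in practice.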
Pages: 1269-1277
Page count: 9