SAFE-clustering: Single-cell Aggregated (from Ensemble) clustering for single-cell RNA-seq data

Cited by: 76
Authors
Yang, Yuchen [1]
Huh, Ruth [2]
Culpepper, Houston W. [1]
Lin, Yuan [3]
Love, Michael I. [1, 2]
Li, Yun [1, 2]
Affiliations
[1] Univ N Carolina, Dept Genet, Chapel Hill, NC 27599 USA
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[3] Peking Univ, Sch Life Sci, Ctr Bioinformat, Beijing 100871, Peoples R China
Funding
National Institutes of Health (USA)
Keywords
GENE-EXPRESSION; EMBRYOS;
DOI
10.1093/bioinformatics/bty793
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Accurately clustering cell types from a mass of heterogeneous cells is a crucial first step in the analysis of single-cell RNA-seq (scRNA-Seq) data. Although several methods have recently been developed, they exploit different characteristics of the data and yield varying results in terms of both the number of clusters and the actual cluster assignments. Results: Here, we present SAFE-clustering, Single-cell Aggregated (From Ensemble) clustering, a flexible, accurate and robust method for clustering scRNA-Seq data. SAFE-clustering takes as input the results from multiple clustering methods and builds one consensus solution. SAFE-clustering currently embeds four state-of-the-art methods (SC3, CIDR, Seurat and t-SNE + k-means) and ensembles the solutions from these four methods using three hypergraph-based partitioning algorithms. Extensive assessment across 12 datasets, with the number of clusters ranging from 3 to 14 and the number of single cells ranging from 49 to 32 695, showcases the advantages of SAFE-clustering in terms of both cluster number (18.2-58.1% reduction in absolute deviation from the truth) and cluster assignment (on average 36.0% improvement, and up to 18.5% over the best of the four methods, measured by the adjusted Rand index). Moreover, SAFE-clustering is computationally efficient enough to accommodate large datasets, taking <10 min to process 28 733 cells. Availability and implementation: SAFEclustering, including source code and a tutorial, is freely available at https://github.com/yycunc/SAFEclustering.
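Illustrative sketch (not part of the record): the abstract describes combining several label vectors into one consensus solution and scoring it with the adjusted Rand index. The Python below is a minimal, standard-library-only sketch of that idea under stated assumptions. It is not the SAFEclustering implementation: the function names and toy labelings are hypothetical, and the majority-vote step is a simple stand-in for the three hypergraph-based partitioning algorithms mentioned in the abstract.

```python
from collections import Counter
from math import comb


def hypergraph_incidence(labelings):
    """Binary cell-by-hyperedge matrix: one hyperedge per cluster of each labeling."""
    n = len(labelings[0])
    columns = []
    for labels in labelings:
        for cluster in sorted(set(labels)):
            columns.append([1 if lab == cluster else 0 for lab in labels])
    # Transpose so that each row describes one cell.
    return [[col[i] for col in columns] for i in range(n)]


def coassociation(labelings):
    """Fraction of input labelings that put each pair of cells in the same cluster."""
    n = len(labelings[0])
    r = len(labelings)
    return [[sum(l[i] == l[j] for l in labelings) / r for j in range(n)]
            for i in range(n)]


def majority_consensus(labelings, threshold=0.5):
    """Stand-in consensus (assumption, not the paper's partitioners):
    connected components of the graph linking cells whose co-association
    exceeds `threshold`."""
    sim = coassociation(labelings)
    n = len(sim)
    consensus = [-1] * n
    next_label = 0
    for start in range(n):
        if consensus[start] != -1:
            continue
        consensus[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if consensus[j] == -1 and sim[i][j] > threshold:
                    consensus[j] = next_label
                    stack.append(j)
        next_label += 1
    return consensus


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label vectors of equal length."""
    n = len(a)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example: six cells, four hypothetical method outputs (made-up labels).
truth = [0, 0, 0, 1, 1, 1]
labelings = [
    [0, 0, 0, 1, 1, 1],  # agrees with the truth
    [1, 1, 1, 0, 0, 0],  # same partition, permuted label names
    [0, 0, 1, 1, 1, 1],  # one cell misassigned
    [0, 0, 0, 0, 1, 1],  # a different cell misassigned
]
incidence = hypergraph_incidence(labelings)
consensus = majority_consensus(labelings)
score = adjusted_rand_index(consensus, truth)
```

In this toy case the two single-cell errors are outvoted by the other labelings, so the consensus recovers the true partition even though two of the four inputs do not. Real data would need a proper graph or hypergraph partitioner rather than a fixed threshold.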
Pages: 1269-1277
Page count: 9
Related papers
50 in total
  • [1] Analysis of Single-Cell RNA-seq Data by Clustering Approaches
    Zhu, Xiaoshu
    Li, Hong-Dong
    Guo, Lilu
    Wu, Fang-Xiang
    Wang, Jianxin
    [J]. CURRENT BIOINFORMATICS, 2019, 14 (04) : 314 - 322
  • [2] Deep Learning for Clustering Single-cell RNA-seq Data
    Zhu, Yuan
    Bai, Litai
    Ning, Zilin
    Fu, Wenfei
    Liu, Jie
    Jiang, Linfeng
    Fei, Shihuang
    Gong, Shiyun
    Lu, Lulu
    Deng, Minghua
    Yi, Ming
    [J]. CURRENT BIOINFORMATICS, 2024, 19 (03) : 193 - 210
  • [3] An active learning approach for clustering single-cell RNA-seq data
    Lin, Xiang
    Liu, Haoran
    Wei, Zhi
    Roy, Senjuti Basu
    Gao, Nan
    [J]. LABORATORY INVESTIGATION, 2022, 102 (03) : 227 - 235
  • [4] A Global Similarity Learning for Clustering of Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Guo, Lilu
    Xu, Yunpei
    Li, Hong-Dong
    Liao, Xingyu
    Wu, Fang-Xiang
    Peng, Xiaoqing
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 261 - 266
  • [5] Impact of similarity metrics on single-cell RNA-seq data clustering
    Kim, Taiyun
    Chen, Irene Rui
    Lin, Yingxin
    Wang, Andy Yi-Yang
    Yang, Jean Yee Hwa
    Yang, Pengyi
    [J]. BRIEFINGS IN BIOINFORMATICS, 2019, 20 (06) : 2316 - 2326
  • [6] Single-cell RNA-seq data clustering by deep information fusion
    Ren, Liangrui
    Wang, Jun
    Li, Wei
    Guo, Maozu
    Yu, Guoxian
    [J]. BRIEFINGS IN FUNCTIONAL GENOMICS, 2024, 23 (02) : 128 - 137
  • [7] ECBN: Ensemble Clustering based on Bayesian Network inference for Single-cell RNA-seq Data
    Zhang, Dexin
    Zhu, Yuan
    [J]. PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 5884 - 5888
  • [8] A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Li, Hong-Dong
    Xu, Yunpei
    Guo, Lilu
    Wu, Fang-Xiang
    Duan, Guihua
    Wang, Jianxin
    [J]. GENES, 2019, 10 (02)
  • [9] An interpretable framework for clustering single-cell RNA-Seq datasets
    Jesse M. Zhang
    Jue Fan
    H. Christina Fan
    David Rosenfeld
    David N. Tse
    [J]. BMC Bioinformatics, 19
  • [10] scMAE: a masked autoencoder for single-cell RNA-seq clustering
    Fang, Zhaoyu
    Zheng, Ruiqing
    Li, Min
    [J]. BIOINFORMATICS, 2024, 40 (01)