Consensus clustering of single-cell RNA-seq data by enhancing network affinity

Cited by: 29
Authors
Cui, Yaxuan [1]
Zhang, Shaoqiang [1]
Liang, Ying [1]
Wang, Xiangyun [1]
Ferraro, Thomas N. [2]
Chen, Yong [3]
Affiliations
[1] Tianjin Normal Univ, Coll Comp & Informat Engn, Tianjin 300387, Peoples R China
[2] CMSRU, Dept Biomed Sci, Camden, NJ USA
[3] Rowan Univ, Dept Mol & Cellular Biosci, Camden, NJ 08028 USA
Funding
National Science Foundation (USA)
Keywords
single-cell RNA-seq; clustering algorithm; bioinformatics; cell typing; GENE-EXPRESSION; HETEROGENEITY; EMBRYOS; STATES; ATLAS; FATE;
DOI
10.1093/bib/bbab236
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Elucidation of cell subpopulations at high resolution is a key and challenging goal of single-cell ribonucleic acid (RNA) sequencing (scRNA-seq) data analysis. Although unsupervised clustering methods have been proposed for de novo identification of cell populations, their performance and robustness suffer from the high variability, low capture efficiency and high dropout rates that are characteristic of scRNA-seq experiments. Here, we present a novel unsupervised method for Single-cell Clustering by Enhancing Network Affinity (SCENA), which employs three main strategies: selecting multiple gene sets, enhancing local affinity among cells and clustering of consensus matrices. Large-scale validations on 13 real scRNA-seq datasets show that SCENA has high accuracy in detecting cell populations and is robust against dropout noise. When we applied SCENA to large-scale scRNA-seq data of mouse brain cells, known cell types were successfully detected, and novel cell types of interneurons were identified with differential expression of gamma-aminobutyric acid receptor subunits and transporters. SCENA is equipped with CPU+GPU (Central Processing Unit + Graphics Processing Unit) heterogeneous parallel computing to achieve high running speed. The high performance and running speed of SCENA combine into a new and efficient platform for biological discoveries in clustering analysis of large and diverse scRNA-seq datasets.
Pages: 14
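The abstract names three strategies: multiple gene sets, local affinity enhancement, and clustering of consensus matrices. As a rough illustration of the first and third only, the sketch below clusters cells once per gene set and then combines the results through a co-association (consensus) matrix. It is not SCENA's implementation. The function names, the variance-based gene selection, the k-nearest-neighbour graph, the connected-components clustering and every parameter value are assumptions chosen to keep the example small. The affinity-enhancement step is omitted because the abstract does not specify it.

```python
import numpy as np

def top_variance_genes(X, n_genes):
    """Column indices of the n_genes most variable genes (cells are rows)."""
    return np.argsort(X.var(axis=0))[::-1][:n_genes]

def knn_graph(X, k):
    """Symmetric boolean k-nearest-neighbour graph between cells."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    graph = np.zeros(dist.shape, dtype=bool)
    graph[np.repeat(np.arange(len(X)), k), nearest.ravel()] = True
    return graph | graph.T

def connected_components(graph):
    """Label each node with the index of its connected component."""
    labels = np.full(len(graph), -1)
    current = 0
    for start in range(len(graph)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(graph[node]):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels

def consensus_clusters(X, gene_set_sizes=(5, 10, 20), k=5, agreement=0.5):
    """Cluster once per gene set, then cluster the co-association matrix."""
    n = X.shape[0]
    co_assoc = np.zeros((n, n))
    for size in gene_set_sizes:
        genes = top_variance_genes(X, size)
        labels = connected_components(knn_graph(X[:, genes], k))
        co_assoc += labels[:, None] == labels[None, :]
    co_assoc /= len(gene_set_sizes)  # fraction of gene sets that agree
    # Final clusters: cells linked by pairs that agree often enough.
    return connected_components(co_assoc >= agreement)

# Toy data (hypothetical): 20 cells x 30 genes, with two groups of cells
# that differ strongly in the first five genes.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(20, 30))
X[:10, :5] += 10.0
labels = consensus_clusters(X)
print(labels)
```

Pairs of cells that fall in the same cluster for at least the `agreement` fraction of gene sets stay linked in the final step, so a grouping that appears under only one gene set carries less weight than one that recurs across gene sets.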