Accurate feature selection improves single-cell RNA-seq cell clustering

Cited by: 31
Authors
Su, Kenong [1]
Yu, Tianwei [2]
Wu, Hao [3]
Affiliations
[1] Emory Univ, Dept Comp Sci, Atlanta, GA 30322 USA
[2] Chinese Univ Hong Kong, Sch Data Sci, Shenzhen, Peoples R China
[3] Emory Univ, Dept Biostat & Bioinformat, Atlanta, GA 30322 USA
Keywords
single-cell RNA sequencing; cell clustering; feature selection; normalization
DOI
10.1093/bib/bbab034
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as 'features'), whose expression patterns will then be used for downstream clustering. A good set of features should include the ones that distinguish different cell types, and the quality of such a set can have a significant impact on clustering accuracy. All existing scRNA-seq clustering tools include a feature selection step that relies on simple unsupervised feature selection methods, mostly based on the statistical moments of gene-wise expression distributions. In this work, we carefully evaluate the impact of feature selection on cell clustering accuracy. In addition, we develop a feature selection algorithm named FEAture SelecTion (FEAST), which provides more representative features. We apply the method to 12 public scRNA-seq datasets and demonstrate that using features selected by FEAST with existing clustering tools significantly improves clustering accuracy.
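For orientation, the kind of moment-based unsupervised selection the abstract refers to can be sketched as follows. This is only an illustration and is not the FEAST algorithm: it ranks genes by the Fano factor (variance divided by mean), one common moment-based criterion. The function name `select_features_by_dispersion`, the cells-by-genes matrix layout, and the toy data are assumptions made for this example.

```python
import numpy as np

def select_features_by_dispersion(counts, n_features=2):
    """Rank genes by Fano factor (variance / mean) and return the top indices.

    counts: 2-D array with rows = cells and columns = genes (assumed layout).
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=0)
    var = counts.var(axis=0)
    # Genes with zero mean carry no signal; leave their dispersion at 0.
    fano = np.divide(var, mean, out=np.zeros_like(mean), where=mean > 0)
    # Sort in descending order and keep the n_features most dispersed genes.
    order = np.argsort(-fano, kind="stable")
    return order[:n_features]

# Toy matrix: 4 cells x 4 genes. Genes 0 and 2 differ between two cell groups,
# gene 1 is constant, and gene 3 is never expressed.
toy = np.array([
    [10, 5, 0, 0],
    [12, 5, 0, 0],
    [0, 5, 9, 0],
    [0, 5, 11, 0],
])
selected = select_features_by_dispersion(toy, n_features=2)
```

On the toy matrix this picks genes 0 and 2, the two that separate the cell groups. A purely moment-based score of this kind has no notion of cell type, which is the limitation the abstract sets out to evaluate.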
Pages: 10
Related papers
50 in total
  • [1] scFseCluster: a feature selection-enhanced clustering for single-cell RNA-seq data
    Wang, Zongqin
    Xie, Xiaojun
    Liu, Shouyang
    Ji, Zhiwei
    [J]. LIFE SCIENCE ALLIANCE, 2023, 6 (12)
  • [2] FEATS: feature selection-based clustering of single-cell RNA-seq data
    Vans, Edwin
    Patil, Ashwini
    Sharma, Alok
    [J]. BRIEFINGS IN BIOINFORMATICS, 2021, 22 (04)
  • [3] Clustering of Small-Sample Single-Cell RNA-Seq Data via Feature Clustering and Selection
    Vans, Edwin
    Sharma, Alok
    Patil, Ashwini
    Shigemizu, Daichi
    Tsunoda, Tatsuhiko
    [J]. PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2019, 11672 : 445 - 456
  • [4] Secuer: Ultrafast, scalable and accurate clustering of single-cell RNA-seq data
    Wei, Nana
    Nie, Yating
    Liu, Lin
    Zheng, Xiaoqi
    Wu, Hua-Jun
    [J]. PLOS COMPUTATIONAL BIOLOGY, 2022, 18 (12)
  • [5] Feature Selection in Single-Cell RNA-seq Data via a Genetic Algorithm
    Chatzilygeroudis, Konstantinos I.
    Vrahatis, Aristidis G.
    Tasoulis, Sotiris K.
    Vrahatis, Michael N.
    [J]. LEARNING AND INTELLIGENT OPTIMIZATION, LION 15, 2021, 12931 : 66 - 79
  • [6] Comparison of Gene Selection Methods for Clustering Single-cell RNA-seq Data
    Zhu, Xiaoshu
    Wang, Jianxin
    Li, Rongruan
    Peng, Xiaoqing
    [J]. CURRENT BIOINFORMATICS, 2023, 18 (01) : 1 - 11
  • [7] CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
    Lin, Peijie
    Troup, Michael
    Ho, Joshua W. K.
    [J]. GENOME BIOLOGY, 2017, 18
  • [8] CIDR: Ultrafast and accurate clustering through imputation for single-cell RNA-seq data
    Peijie Lin
    Michael Troup
    Joshua W. K. Ho
    [J]. Genome Biology, 18
  • [9] Analysis of Single-Cell RNA-seq Data by Clustering Approaches
    Zhu, Xiaoshu
    Li, Hong-Dong
    Guo, Lilu
    Wu, Fang-Xiang
    Wang, Jianxin
    [J]. CURRENT BIOINFORMATICS, 2019, 14 (04) : 314 - 322
  • [10] An interpretable framework for clustering single-cell RNA-Seq datasets
    Jesse M. Zhang
    Jue Fan
    H. Christina Fan
    David Rosenfeld
    David N. Tse
    [J]. BMC Bioinformatics, 19