Supervised Adversarial Alignment of Single-Cell RNA-seq Data

Cited by: 12
Authors
Ge, Songwei [1]
Wang, Haohan [2]
Alavi, Amir [1]
Xing, Eric [2,3]
Bar-Joseph, Ziv [1,3]
Affiliations
[1] Carnegie Mellon University, Computational Biology Department, 5000 Forbes Ave, Pittsburgh, PA 15213, USA
[2] Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA 15213, USA
[3] Carnegie Mellon University, Machine Learning Department, Pittsburgh, PA 15213, USA
Funding
National Institutes of Health (USA)
Keywords
batch effect removal; data integration; dimensionality reduction; domain adversarial training; single-cell RNA-seq
DOI
10.1089/cmb.2020.0439
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Dimensionality reduction is an important first step in the analysis of single-cell RNA-sequencing (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analysis methods, ranging from pseudo-time reconstruction to clustering to the alignment of scRNA-seq data from different experiments, platforms, and laboratories. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq data. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, batch effects can lead to representations that are batch specific rather than cell-type specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced-dimension representation of scRNA-seq data. The adversarial model simultaneously optimizes two objectives: the accuracy of cell-type assignment and the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different data sets. As we show, by overcoming batch effects, our method correctly separated cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that, by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.
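The two objectives named in the abstract can be illustrated with a minimal sketch. The abstract does not give the network architecture, the loss functions, or the weighting between the objectives, so the code below assumes the generic domain-adversarial formulation: a cross-entropy loss for cell-type assignment, a cross-entropy loss for a batch discriminator, and an encoder objective that subtracts the batch loss scaled by a trade-off weight. The names `cross_entropy`, `adversarial_objectives`, and `lam` are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[target]


def adversarial_objectives(type_logits, cell_type, batch_logits, batch, lam=1.0):
    """Losses for one cell under an assumed domain-adversarial setup.

    The cell-type classifier and the batch discriminator each minimize
    their own cross-entropy. The shared encoder minimizes the cell-type
    loss while maximizing the batch loss, so it is rewarded when the
    batch cannot be recovered from the representation. `lam` is an
    assumed trade-off weight.
    """
    type_loss = cross_entropy(type_logits, cell_type)
    batch_loss = cross_entropy(batch_logits, batch)
    return {
        "type_classifier": type_loss,
        "batch_discriminator": batch_loss,
        "encoder": type_loss - lam * batch_loss,
    }


# Same cell-type prediction, two hypothetical batch discriminators:
# one that cannot tell the batches apart, one that identifies the batch.
confused = adversarial_objectives([2.0, 0.0, 0.0], 0, [0.0, 0.0], 1)
leaky = adversarial_objectives([2.0, 0.0, 0.0], 0, [-3.0, 3.0], 1)

# The encoder objective is lower when the discriminator is confused.
print(confused["encoder"] < leaky["encoder"])
```

In a trained network these quantities would be averaged over cells and optimized by gradient descent, commonly with a gradient reversal layer between the encoder and the batch discriminator. Whether the paper uses that mechanism is not stated in the abstract.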
Pages: 501-513
Page count: 13