scVAEBGM: Clustering Analysis of Single-Cell ATAC-seq Data Using a Deep Generative Model

Cited by: 2
Authors
Duan, Hongyu [1]
Li, Feng [1]
Shang, Junliang [1]
Liu, Jinxing [1]
Li, Yan [2]
Liu, Xikui [2]
Affiliations
[1] Qufu Normal Univ, Sch Comp Sci, Rizhao 276826, Peoples R China
[2] Shandong Univ Sci & Technol, Dept Elect Engn & Informat Technol, Jinan 250031, Shandong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
scATAC-seq; Clustering; Deep learning; Variational autoencoder; Bayesian Gaussian-mixture model
DOI
10.1007/s12539-022-00536-w
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Recent advances in single-cell technologies have driven a surge in research. In particular, the single-cell Assay for Transposase-Accessible Chromatin with high-throughput sequencing (scATAC-seq) is a popular approach for analyzing differences in chromatin accessibility at the single-cell level, either within or between groups. It is therefore critical for examining cell heterogeneity at a previously unattainable resolution and for identifying both known and unknown cell types. However, the ever-increasing number of cells produced by technological development, together with the high noise, sparsity, and dimensionality of the data, makes it challenging to distinguish cell types. We propose scVAEBGM, which integrates a Variational Autoencoder (VAE) with a Bayesian Gaussian-mixture model (BGM) to process and analyze scATAC-seq data. The method uses the Bayesian Gaussian mixture model to estimate the number of cell types without fixing the cluster number in advance. In other words, the number of clusters is inferred from the data, which avoids the bias introduced by choosing it subjectively by hand. Additionally, the method is more robust to noise and can better represent single-cell data in lower dimensions. We also devise a further clustering strategy: experiments indicate that clustering again on top of the completed clustering can further improve clustering accuracy. We test on six public datasets, and scVAEBGM outperforms various dimension reduction baselines. In downstream applications, scVAEBGM can reveal biological cell types.
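The idea of inferring the number of clusters from the data can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `BayesianGaussianMixture` with a Dirichlet-process prior, and it uses synthetic points as a stand-in for the latent embedding that a VAE would produce. The helper name `infer_clusters`, the upper bound of 10 components, and the prior value 0.01 are all illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture


def infer_clusters(latent, max_components=10, seed=0):
    """Fit a Dirichlet-process Bayesian GMM; return labels and the number of occupied clusters."""
    bgm = BayesianGaussianMixture(
        n_components=max_components,  # upper bound only, not the final cluster count
        weight_concentration_prior_type="dirichlet_process",
        weight_concentration_prior=0.01,  # assumed value; small values favour fewer active components
        covariance_type="full",
        max_iter=500,
        random_state=seed,
    )
    labels = bgm.fit_predict(latent)
    # Count only the components that actually received cells.
    n_occupied = int(np.unique(labels).size)
    return labels, n_occupied


# Synthetic stand-in for a VAE latent space (assumption): three well-separated groups in 5 dimensions.
rng = np.random.default_rng(0)
centers = np.array([[0.0] * 5, [10.0] * 5, [-10.0] * 5])
latent = np.vstack([c + 0.5 * rng.standard_normal((100, 5)) for c in centers])

labels, n_occupied = infer_clusters(latent)
print("occupied clusters:", n_occupied)
```

Here `n_components` acts only as a ceiling: components that the fit leaves without cells do not count toward the reported number of clusters, so no cluster number has to be chosen by hand.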
Pages: 917-928
Number of pages: 12