High-Order Correlation Integration for Single-Cell or Bulk RNA-seq Data Analysis

Cited by: 8
Authors
Tang, Hui [1]
Zeng, Tao [1]
Chen, Luonan [1, 2, 3, 4]
Affiliations
[1] Univ Chinese Acad Sci, CAS Ctr Excellence Mol Cell Sci, Inst Biochem & Cell Biol, Shanghai Inst Biol Sci, Key Lab Syst Biol, Chinese, Shanghai, Peoples R China
[2] Chinese Acad Sci, CAS Ctr Excellence Anim Evolut & Genet, Kunming, Yunnan, Peoples R China
[3] ShanghaiTech Univ, Sch Life Sci & Technol, Shanghai, Peoples R China
[4] Shanghai Res Ctr Brain Sci & Brain Inspired Intel, Shanghai, Peoples R China
Source
FRONTIERS IN GENETICS | 2019 / Volume 10
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China;
Keywords
high-order; integration; clustering; single-cell; bulk data analysis; GENE-EXPRESSION; SIGNALING PATHWAYS; DISCOVERY; MODULES; HETEROGENEITY; EMBRYOS; COLON; MAPK;
DOI
10.3389/fgene.2019.00371
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Quantifying or labeling sample types with high quality is a challenging task and a key step in understanding complex diseases. In sample clustering and classification, it is important to reduce noise in the data and to ensure that the extracted intrinsic patterns are in concordance with the primary data structure. Here we propose an effective data integration framework named HCI (High-order Correlation Integration), which takes advantage of a high-order correlation matrix combined with pattern fusion analysis (PFA) to extract features from high-dimensional data. On the one hand, the high-order Pearson's correlation coefficient can highlight the latent patterns underlying noisy input datasets and thus improve the accuracy and robustness of currently available sample clustering algorithms. On the other hand, PFA can efficiently identify intrinsic sample patterns from different input matrices by optimally adjusting the signal effects. To validate the effectiveness of our new method, we first applied HCI to four single-cell RNA-seq datasets to distinguish cell types, and found that HCI identifies the previously known cell types of single-cell samples from scRNA-seq data with higher accuracy and robustness than other methods under different conditions. Second, we integrated heterogeneous omics data from TCGA and GEO datasets, including bulk RNA-seq data, where HCI outperformed the other methods at identifying distinct cancer subtypes. In an additional case study, we constructed the mRNA-miRNA regulatory network of colorectal cancer based on the feature weights estimated by HCI, where the differentially expressed mRNAs and miRNAs were significantly enriched in well-known functional sets of colorectal cancer, such as KEGG pathways and IPA disease annotations. All these results support that HCI has extensive flexibility and applicability for sample clustering with different types and organizations of RNA-seq data.
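The high-order correlation idea in the abstract can be illustrated with a minimal sketch. It assumes that "high-order" means applying Pearson's correlation repeatedly to the sample-by-sample correlation matrix; the abstract does not spell out the exact definition, so this is an interpretation rather than the authors' implementation. The function name `high_order_correlation`, the `order` parameter, and the simulated data are all hypothetical, and the PFA step is not covered.

```python
import numpy as np


def high_order_correlation(X, order=2):
    """Return a samples-by-samples matrix after `order` rounds of Pearson correlation.

    X is a features-by-samples matrix (e.g. genes by cells).
    Assumption: order 1 is the ordinary correlation between samples, and each
    further order correlates the columns of the previous correlation matrix.
    """
    if order < 1:
        raise ValueError("order must be >= 1")
    # First order: correlate samples (columns) across features.
    C = np.corrcoef(X, rowvar=False)
    # Higher orders: correlate the correlation profiles of the samples.
    for _ in range(order - 1):
        C = np.corrcoef(C, rowvar=False)
    return C


# Simulated data (hypothetical): two groups of 5 samples, 200 features each,
# built from two underlying profiles with added noise.
rng = np.random.default_rng(0)
profile_a = rng.normal(size=200)
profile_b = rng.normal(size=200)
group_a = [profile_a + rng.normal(scale=2.0, size=200) for _ in range(5)]
group_b = [profile_b + rng.normal(scale=2.0, size=200) for _ in range(5)]
X = np.column_stack(group_a + group_b)

C1 = high_order_correlation(X, order=1)
C2 = high_order_correlation(X, order=2)
print(C1.shape, C2.shape)
```

Either matrix could then be passed to a standard clustering routine that accepts a similarity matrix; comparing the within-group and between-group entries of `C1` and `C2` is one way to inspect how the second-order step changes the block structure.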
Pages: 10