swCAM: estimation of subtype-specific expressions in individual samples with unsupervised sample-wise deconvolution

Cited by: 3
Authors
Chen, Lulu [1]
Wu, Chiung-Ting [1]
Lin, Chia-Hsiang [2]
Dai, Rujia [3]
Liu, Chunyu [3]
Clarke, Robert [4]
Yu, Guoqiang [1]
Van Eyk, Jennifer E. [5]
Herrington, David M. [6]
Wang, Yue [1]
Affiliations
[1] Virginia Polytech Inst & State Univ, Dept Elect & Comp Engn, Arlington, VA 22203 USA
[2] Natl Cheng Kung Univ, Dept Elect Engn, Tainan 70101, Taiwan
[3] SUNY Upstate Med Univ, Dept Psychiat, Syracuse, NY 13210 USA
[4] Univ Minnesota, Hormel Inst, 801 16th Ave NE, Austin, MN 55912 USA
[5] Cedars Sinai Med Ctr, Adv Clin Biosyst Res Inst, Los Angeles, CA 90048 USA
[6] Wake Forest Univ, Dept Internal Med, Winston Salem, NC 27157 USA
Funding
National Institutes of Health (USA)
DOI
10.1093/bioinformatics/btab839
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Complex biological tissues are often a heterogeneous mixture of several molecularly distinct cell subtypes. Both subtype compositions and subtype-specific (STS) expressions can vary across biological conditions. Computational deconvolution aims to dissect patterns of bulk tissue data into subtype compositions and STS expressions. Existing deconvolution methods can only estimate averaged STS expressions in a population, while many downstream analyses such as inferring co-expression networks in particular subtypes require subtype expression estimates in individual samples. However, individual-level deconvolution is a mathematically underdetermined problem because there are more variables than observations.

Results: We report a sample-wise Convex Analysis of Mixtures (swCAM) method that can estimate subtype proportions and STS expressions in individual samples from bulk tissue transcriptomes. We extend our previous CAM framework to include a new term accounting for between-sample variations and formulate swCAM as a nuclear-norm and ℓ2,1-norm regularized matrix factorization problem. We determine hyperparameter values using cross-validation with random entry exclusion and obtain a swCAM solution using an efficient alternating direction method of multipliers. Experimental results on realistic simulation data show that swCAM can accurately estimate STS expressions in individual samples and successfully extract co-expression networks in particular subtypes that are otherwise unobtainable using bulk data. In two real-world applications, swCAM analysis of bulk RNA-seq data from brain tissue of cases and controls with bipolar disorder or Alzheimer's disease identified significant changes in cell proportion, expression pattern and co-expression module in patient neurons. Comparative evaluation of swCAM versus peer methods is also provided.
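
The abstract names two regularizers (nuclear norm and ℓ2,1 norm), an alternating direction method of multipliers (ADMM) solver, and cross-validation with random entry exclusion. The sketch below is not the authors' implementation. It only illustrates the standard proximal operators that ADMM-type solvers typically apply for these two norms, plus one possible way to draw a random hold-out mask over matrix entries. The function names, the choice of rows as the ℓ2,1 groups, and the `holdout_frac` parameter are assumptions made for illustration.

```python
import numpy as np


def prox_nuclear(X, tau):
    """Proximal operator of tau * ||X||_* (singular value thresholding)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # soft-threshold the singular values
    return (U * s_shrunk) @ Vt


def prox_l21(X, tau):
    """Proximal operator of tau * sum_i ||X[i, :]||_2.

    Assumption: rows are the groups; the paper may group differently.
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Shrink each row toward zero; rows with norm <= tau become exactly zero.
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return X * scale


def random_entry_mask(shape, holdout_frac, rng):
    """Boolean mask: True = entry kept for fitting, False = held out.

    Hypothetical helper illustrating random entry exclusion for
    cross-validation; held-out entries would be scored after fitting.
    """
    return rng.random(shape) >= holdout_frac


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))
    low_rank = prox_nuclear(X, tau=1.0)
    row_sparse = prox_l21(X, tau=1.0)
    mask = random_entry_mask(X.shape, holdout_frac=0.2, rng=rng)
    print(np.linalg.matrix_rank(low_rank), row_sparse.shape, mask.mean())
```

In a full solver these operators would be applied inside the ADMM iterations to the corresponding matrix blocks, and the hyperparameters (here `tau`) would be chosen by the reconstruction error on the held-out entries.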
Pages: 1403-1410
Number of pages: 8