Personalized regression enables sample-specific pan-cancer analysis

Cited by: 9
Authors
Lengerich, Benjamin J. [1]
Aragam, Bryon [2]
Xing, Eric P. [1, 2, 3]
Affiliations
[1] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
[2] Carnegie Mellon Univ, Machine Learning Dept, Pittsburgh, PA 15213 USA
[3] Petuum Inc, Pittsburgh, PA 15222 USA
Keywords
HETEROGENEITY; EXPRESSION; ONTOLOGY
DOI
10.1093/bioinformatics/bty250
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: In many applications, inter-sample heterogeneity is crucial to understanding the complex biological processes under study. For example, in genomic analysis of cancers, each patient in a cohort may have a different driver mutation, making it difficult or impossible to identify causal mutations from an averaged view of the entire cohort. Unfortunately, many traditional methods for genomic analysis seek to estimate a single model which is shared by all samples in a population, ignoring this inter-sample heterogeneity entirely. In order to better understand patient heterogeneity, it is necessary to develop practical, personalized statistical models.
Results: To uncover this inter-sample heterogeneity, we propose a novel regularizer for achieving patient-specific personalized estimation. This regularizer operates by learning two latent distance metrics (one between personalized parameters and one between clinical covariates) and attempting to match the induced distances as closely as possible. Crucially, we do not assume these distance metrics are already known. Instead, we allow the data to dictate the structure of these latent distance metrics. Finally, we apply our method to learn patient-specific, interpretable models for a pan-cancer gene expression dataset containing samples from more than 30 distinct cancer types and find strong evidence of personalization effects between cancer types as well as between individuals. Our analysis uncovers sample-specific aberrations that are overlooked by population-level methods, suggesting a promising new path for precision analysis of complex diseases such as cancer.
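The distance-matching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weighted L1 form of both distances, the squared-difference penalty, and the names `weighted_l1_distances`, `distance_matching_penalty`, `w_theta`, and `w_u` are assumptions made here for illustration. The sketch only evaluates the penalty for fixed weights; in a full method the personalized parameters and the metric weights would presumably be learned together with a prediction loss.

```python
import numpy as np


def weighted_l1_distances(X, weights):
    """Pairwise distances d(i, j) = sum_k weights[k] * |X[i, k] - X[j, k]|."""
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)
    diffs = np.abs(X[:, None, :] - X[None, :, :])  # shape (n, n, d)
    return diffs @ weights                          # shape (n, n)


def distance_matching_penalty(theta, U, w_theta, w_u):
    """Sum over sample pairs of the squared gap between two induced distances.

    theta   : (n, p) per-sample (personalized) regression parameters
    U       : (n, k) per-sample clinical covariates
    w_theta : (p,) nonnegative weights defining the parameter metric (assumed form)
    w_u     : (k,) nonnegative weights defining the covariate metric (assumed form)
    """
    d_theta = weighted_l1_distances(theta, w_theta)
    d_u = weighted_l1_distances(U, w_u)
    # Count each unordered pair (i, j) with i < j exactly once.
    upper = np.triu_indices(d_theta.shape[0], k=1)
    return float(np.sum((d_theta[upper] - d_u[upper]) ** 2))


# Toy example: three samples, two parameters each, one covariate.
theta = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
U = np.array([[0.0], [1.0], [2.0]])
penalty = distance_matching_penalty(theta, U, w_theta=[1.0, 1.0], w_u=[1.0])
print(penalty)
```

A small penalty means that samples with similar covariates also have similar personalized parameters under the current pair of metrics; a large penalty signals a mismatch between the two induced distance structures.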
Pages: 178-186
Page count: 9