Prediction of disease-free survival for precision medicine using cooperative learning on multi-omic data

Cited by: 1
Authors
Hahn, Georg [1 ]
Prokopenko, Dmitry [2 ]
Hecker, Julian [3 ]
Lutz, Sharon M. [1 ]
Mullin, Kristina [2 ]
Sejour, Leinal
Hide, Winston [4 ]
Vlachos, Ioannis [4 ]
DeSantis, Stacia [5 ]
Tanzi, Rudolph E. [2 ]
Lange, Christoph [1 ]
Affiliations
[1] Harvard TH Chan Sch Publ Hlth, Dept Biostat, 677 Huntington Ave, Boston, MA 02115 USA
[2] Massachusetts Gen Hosp MGH, McCance Ctr Brain Hlth, Dept Neurol, Genet & Aging Res Unit, Boston, MA 02114 USA
[3] Brigham & Womens Hosp, Harvard Med Sch, Dept Med, Cardiovasc Div, 75 Francis St, Boston, MA 02115 USA
[4] Beth Israel Deaconess Med Ctr, Dept Pathol, 330 Brookline Ave, Boston, MA 02215 USA
[5] Univ Texas Hlth Sci Ctr Houston, 1200 Pressler St,Houston Campus, Houston, TX 77030 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
Alzheimer; cooperative learning; Cox proportional hazard; lasso; penalized regression; precision medicine; survival; INSIGHTS; RISK
DOI
10.1093/bib/bbae267
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010 ; 081704 ;
Abstract
In precision medicine, predicting an individual's disease susceptibility and forecasting their disease-free survival are both key areas of research. Besides the classical epidemiological predictor variables, data from multiple (omic) platforms are increasingly available. To integrate this wealth of information, we propose a new methodology that combines cooperative learning, a recent approach to leveraging the predictive power of several datasets, with polygenic hazard score models. Polygenic hazard score models give a practitioner a more differentiated view of the predicted disease-free survival than a mere point estimate, such as one computed with a polygenic risk score. Our aim is to leverage the advantages of cooperative learning for the computation of polygenic hazard score models via Cox's proportional hazard model, thereby improving the prediction of disease-free survival. In our experimental study, we apply our methodology to forecast disease-free survival for Alzheimer's disease (AD) using three layers of data. One layer contains epidemiological variables such as sex, APOE (apolipoprotein E, a genetic risk factor for AD) status and 10 leading principal components. Another layer contains selected genomic loci, and the last layer contains methylation data for selected CpG sites. We demonstrate that the survival curves computed via cooperative learning yield an AUC of around 0.7, exceeding the performance of state-of-the-art competitors. Importantly, the proposed methodology returns (1) a linear score that can be easily interpreted (in contrast to machine learning approaches), and (2) a weighting of the predictive power of the involved data layers, allowing for an assessment of the importance of each omic (or other) platform. Similarly to polygenic hazard score models, our methodology also allows one to compute individual survival curves for each patient.
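As a rough illustration of how a linear score turns into individual survival curves, recall that under a Cox proportional hazards model a patient with linear score eta has survival function S(t | x) = S0(t) ^ exp(eta), where S0 is the baseline survival function. The Python sketch below applies this relation to a score summed over three data layers. It is not the authors' implementation: the layer sizes, the coefficients, the baseline curve, and the function names `linear_score` and `survival_curves` are all hypothetical placeholders, and the cooperative-learning fit that would produce the coefficients is not shown.

```python
import numpy as np


def linear_score(layers, coefs):
    """Sum the per-layer linear predictors into one score per patient."""
    return sum(np.asarray(X) @ np.asarray(b) for X, b in zip(layers, coefs))


def survival_curves(baseline_survival, scores):
    """Cox relation S(t | x) = S0(t) ** exp(score); returns shape (n, T)."""
    s0 = np.asarray(baseline_survival, dtype=float)   # shape (T,)
    eta = np.asarray(scores, dtype=float)             # shape (n,)
    return s0[None, :] ** np.exp(eta)[:, None]


# Hypothetical inputs: three data layers (e.g. epidemiological, genomic,
# methylation) with random features and random placeholder coefficients.
rng = np.random.default_rng(0)
n_patients = 4
layer_sizes = (3, 5, 6)
layers = [rng.normal(size=(n_patients, p)) for p in layer_sizes]
coefs = [rng.normal(scale=0.2, size=p) for p in layer_sizes]

# Hypothetical baseline survival on a grid of follow-up times.
times = np.linspace(0.0, 30.0, 7)
baseline = np.exp(-0.02 * times)

scores = linear_score(layers, coefs)
curves = survival_curves(baseline, scores)   # one row per patient
print(curves.round(3))
```

Each row of `curves` is one patient's predicted disease-free survival over `times`; a patient with a higher score gets a uniformly lower curve, which is what makes the linear score directly interpretable.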
Pages: 15