Bayesian integrative model for multi-omics data with missingness

Cited by: 16
Authors
Fang, Zhou [1]
Ma, Tianzhou [2]
Tang, Gong [1]
Zhu, Li [1]
Yan, Qi [3]
Wang, Ting [3]
Celedon, Juan C. [3]
Chen, Wei [1, 3]
Tseng, George C. [1]
Affiliations
[1] Univ Pittsburgh, Dept Biostat, Pittsburgh, PA 15261 USA
[2] Univ Maryland, Dept Epidemiol & Biostat, College Pk, MD 20742 USA
[3] UPMC, Childrens Hosp Pittsburgh, Div Pediat Pulmonol Allergy & Immunol, Pittsburgh, PA 15224 USA
Funding
National Institutes of Health (NIH), USA
Keywords
DISEASE SUBTYPE DISCOVERY; GENE-EXPRESSION;
DOI
10.1093/bioinformatics/bty775
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Integrative analysis of multi-omics data from different high-throughput experimental platforms provides valuable insight into the regulatory mechanisms associated with complex diseases, and gains statistical power to detect markers that single-platform omics analysis would otherwise overlook. In practice, a significant portion of samples may not be measured completely because of insufficient tissue or a restricted budget (e.g. gene expression profiles are measured but methylation is not). Current multi-omics integrative methods require complete data. A common practice is to discard samples with any missing platform and perform complete-case analysis, which leads to a substantial loss of statistical power.
Methods: In this article, inspired by the popular Integrative Bayesian Analysis of Genomics data (iBAG), we propose a full Bayesian model that allows incorporation of samples with missing omics data.
Results: Simulation results show that the new full Bayesian approach improves outcome prediction accuracy and feature selection performance when the sample size is limited and the proportion of missingness is large. When the sample size is large or the proportion of missingness is low, incorporating samples with missingness may introduce extra inference uncertainty and yield worse prediction and feature selection performance. To determine whether and how to incorporate samples with missingness, we propose a self-learning cross-validation (CV) decision scheme. Simulations and a real application to a childhood asthma dataset demonstrate superior performance of the CV decision scheme when various types of missing mechanisms are evaluated.
Pages: 3801-3808
Number of pages: 8