Bayesian integrative model for multi-omics data with missingness

Cited by: 16
Authors
Fang, Zhou [1]
Ma, Tianzhou [2]
Tang, Gong [1]
Zhu, Li [1]
Yan, Qi [3]
Wang, Ting [3]
Celedon, Juan C. [3]
Chen, Wei [1, 3]
Tseng, George C. [1]
Affiliations
[1] Univ Pittsburgh, Dept Biostat, Pittsburgh, PA 15261 USA
[2] Univ Maryland, Dept Epidemiol & Biostat, College Pk, MD 20742 USA
[3] UPMC, Childrens Hosp Pittsburgh, Div Pediat Pulmonol Allergy & Immunol, Pittsburgh, PA 15224 USA
Funding
National Institutes of Health (USA)
Keywords
DISEASE SUBTYPE DISCOVERY; GENE-EXPRESSION;
DOI
10.1093/bioinformatics/bty775
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Integrative analysis of multi-omics data from different high-throughput experimental platforms provides valuable insight into regulatory mechanisms associated with complex diseases, and gains statistical power to detect markers that are otherwise overlooked by single-platform omics analysis. In practice, a significant portion of samples may not be measured completely due to insufficient tissue or a restricted budget (e.g. gene expression profiles are measured but methylation is not). Current multi-omics integrative methods require complete data. A common practice is to ignore samples with any missing platform and perform complete case analysis, which leads to substantial loss of statistical power.
Methods: In this article, inspired by the popular Integrative Bayesian Analysis of Genomics data (iBAG), we propose a full Bayesian model that allows incorporation of samples with missing omics data.
Results: Simulation results show that the new full Bayesian approach improves outcome prediction accuracy and feature selection performance when the sample size is limited and the proportion of missingness is large. When the sample size is large or the proportion of missingness is low, incorporating samples with missingness may introduce extra inference uncertainty and yield worse prediction and feature selection performance. To determine whether and how to incorporate samples with missingness, we propose a self-learning cross-validation (CV) decision scheme. Simulations and a real application to a childhood asthma dataset demonstrate superior performance of the CV decision scheme when various types of missingness mechanisms are evaluated.
Pages: 3801-3808
Page count: 8