An extension of latent unknown clustering integrating multi-omics data (LUCID) incorporating incomplete omics data

Cited by: 0
Authors
Zhao, Yinqi [1]
Jia, Qiran [1]
Goodrich, Jesse [1]
Darst, Burcu [2]
Conti, David V. [1]
Affiliations
[1] Univ Southern Calif, Keck Sch Med, Dept Populat & Publ Hlth Sci, 1845 N Soto St, Los Angeles, CA 90033 USA
[2] Fred Hutch Canc Ctr, Publ Hlth Sci Div, Seattle, WA 98109 USA
Source
BIOINFORMATICS ADVANCES | 2024 / Vol. 4 / Issue 01
Funding
National Institutes of Health (USA);
Keywords
VARIABLE SELECTION; MODEL; JOINT;
DOI
10.1093/bioadv/vbae123
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Motivation: Latent unknown clustering integrating multi-omics data (LUCID) is a novel statistical model designed for multi-omics data analysis. It integrates omics data with exposures and an outcome through a latent cluster, elucidating how exposures influence processes reflected in multi-omics measurements, ultimately affecting an outcome. A significant challenge in multi-omics analysis is the issue of list-wise missingness. To address this, we extend the model to incorporate list-wise missingness within an integrated imputation framework, which can also handle sporadic missingness when necessary.

Results: Simulation studies demonstrate that our integrated imputation approach produces consistent and less biased estimates, closely reflecting true underlying values. We applied this model to data from the ISGlobal/ATHLETE "Exposome Data Challenge Event" to explore the association between maternal exposure to hexachlorobenzene and childhood body mass index by integrating incomplete proteomics data from 1301 children. The model successfully estimated proteomics profiles for two clusters representing higher and lower body mass index, characterizing the potential profiles linking prenatal hexachlorobenzene levels and childhood body mass index.

Availability and implementation: The proposed methods have been implemented in the R package LUCIDus. The source code is available at https://github.com/USCbiostats/LUCIDus.
Pages: 11