Deep Learning Methods for Omics Data Imputation

Cited by: 6
Authors
Huang, Lei [1]
Song, Meng [1]
Shen, Hui [2]
Hong, Huixiao [3]
Gong, Ping [4]
Deng, Hong-Wen [2]
Zhang, Chaoyang [1]
Affiliations
[1] Univ Southern Mississippi, Sch Comp Sci & Comp Engn, Hattiesburg, MS 39406 USA
[2] Tulane Univ, Ctr Biomed Informat & Genom, Sch Med, New Orleans, LA 70112 USA
[3] US FDA, Div Bioinformat & Biostat, Natl Ctr Toxicol Res, Jefferson, AR 72079 USA
[4] US Army Engineer Res & Dev Ctr, Environm Lab, Vicksburg, MS 39180 USA
Source
BIOLOGY-BASEL | 2023 / Volume 12 / Issue 10
Funding
US National Institutes of Health (NIH)
Keywords
omics imputation; deep learning; multi-omics imputation
DOI
10.3390/biology12101313
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Simple Summary: Missing values are common in omics data and can arise from various causes. Rather than restricting analysis to complete subsets of a dataset, imputation approaches offer an alternative means of handling missing data. However, the imputation of missing omics data is a challenging task. Advanced imputation methods such as deep learning-based approaches can model complex patterns and relationships in large and high-dimensional omics datasets, making them an increasingly popular choice for imputation. This review provides an overview of deep learning-based methods for omics data imputation, focusing on model architectures and multi-omics data imputation. It also examines the challenges and opportunities associated with deep learning methods in this field.

Abstract: One common problem in omics data analysis is missing values, which can arise for various reasons, such as poor tissue quality and insufficient sample volumes. Instead of discarding missing values and related data, imputation approaches offer an alternative means of handling missing data. However, the imputation of missing omics data is a non-trivial task. Difficulties mainly come from high dimensionality, non-linear or non-monotonic relationships within features, technical variations introduced by sampling methods, sample heterogeneity, and the non-random missingness mechanism. Several advanced imputation methods, including deep learning-based methods, have been proposed to address these challenges. Because deep learning can model complex patterns and relationships in large and high-dimensional datasets, many researchers have adopted deep learning models to impute missing omics data. This review provides a comprehensive overview of the currently available deep learning-based methods for omics imputation from the perspective of deep generative model architectures such as autoencoders, variational autoencoders, generative adversarial networks, and Transformers, with an emphasis on multi-omics data imputation. This review also discusses the opportunities that deep learning brings and the challenges that it might face in this field.
Pages: 26