STATegra: Multi-Omics Data Integration - A Conceptual Scheme With a Bioinformatics Pipeline

Cited by: 18
Authors
Planell, Nuria [1]
Lagani, Vincenzo [2, 3]
Sebastian-Leon, Patricia [4]
van der Kloet, Frans [5]
Ewing, Ewoud [6]
Karathanasis, Nestoras [7, 8]
Urdangarin, Arantxa [1]
Arozarena, Imanol [9]
Jagodic, Maja [6]
Tsamardinos, Ioannis [3, 10]
Tarazona, Sonia [11]
Conesa, Ana [12, 13]
Tegner, Jesper [14, 15, 16]
Gomez-Cabrero, David [1, 14, 15, 17]
Affiliations
[1] Univ Publ Navarra UPNA, Translat Bioinformat Unit, Navarrabiomed, Complejo Hosp Navarra CHN,IdiSNA, Pamplona, Spain
[2] Ilia State Univ, Inst Chem Biol, Tbilisi, Georgia
[3] Gnosis Data Anal PC, Iraklion, Greece
[4] IVI RMA Inst Valenciano Infertilidad Reprod Med A, Dept Genom & Syst Reprod Med, Valencia, Spain
[5] Univ Amsterdam, Swammerdam Inst Life Sci, Amsterdam, Netherlands
[6] Karolinska Inst, Dept Clin Neurosci, Ctr Mol Med, Karolinska Univ Hosp, Stockholm, Sweden
[7] Fdn Res & Technol Hellas, Inst Comp Sci, Iraklion, Greece
[8] Thomas Jefferson Univ, Computat Med Ctr, Philadelphia, PA 19107 USA
[9] Univ Publ Navarra UPNA, Complejo Hosp Navarra CHN, Canc Signalling Unit, Navarrabiomed,Hlth Res Inst Navarre IdiSNA, Pamplona, Spain
[10] Univ Crete, Dept Comp Sci, Iraklion, Greece
[11] Univ Politecn Valencia, Dept Appl Stat Operat Res & Qual, Valencia, Spain
[12] Univ Florida, Inst Food & Agr Sci, Microbiol & Cell Sci, Gainesville, FL 32611 USA
[13] Univ Florida, Genet Inst, Gainesville, FL USA
[14] King Abdullah Univ Sci & Technol KAUST, Biol & Environm Sci & Engn Div, Thuwal, Saudi Arabia
[15] Karolinska Inst, Unit Computat Med, Dept Med, Ctr Mol Med,Karolinska Univ Hosp, Stockholm, Sweden
[16] Sci Life Lab, Solna, Sweden
[17] Kings Coll London, Mucosal & Salivary Biol Div, Inst Dent, London, England
Keywords
multi-omic analyses; data-integration; next-generation sequencing; component analysis; non-parametric combination; GeneSetCluster; COMPONENT ANALYSIS; PROXIMAL GENES; GLIOBLASTOMA; EXPRESSION; SCLEROSIS; MECHANISMS; LANDSCAPE; MUTATION; PATHWAY; CELLS
DOI
10.3389/fgene.2021.620453
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch, with an arbitrary decision over which tools to use and how to combine them. There is therefore an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, that combines available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two case studies from The Cancer Genome Atlas (TCGA). For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
Pages: 12