STATegra: Multi-Omics Data Integration - A Conceptual Scheme With a Bioinformatics Pipeline

Cited by: 18
Authors
Planell, Nuria [1]
Lagani, Vincenzo [2, 3]
Sebastian-Leon, Patricia [4]
van der Kloet, Frans [5]
Ewing, Ewoud [6]
Karathanasis, Nestoras [7, 8]
Urdangarin, Arantxa [1]
Arozarena, Imanol [9]
Jagodic, Maja [6]
Tsamardinos, Ioannis [3, 10]
Tarazona, Sonia [11]
Conesa, Ana [12, 13]
Tegner, Jesper [14, 15, 16]
Gomez-Cabrero, David [1, 14, 15, 17]
Affiliations
[1] Univ Publ Navarra UPNA, Translat Bioinformat Unit, Navarrabiomed, Complejo Hosp Navarra CHN,IdiSNA, Pamplona, Spain
[2] Ilia State Univ, Inst Chem Biol, Tbilisi, Georgia
[3] Gnosis Data Anal PC, Iraklion, Greece
[4] IVI RMA Inst Valenciano Infertilidad Reprod Med A, Dept Genom & Syst Reprod Med, Valencia, Spain
[5] Univ Amsterdam, Swammerdam Inst Life Sci, Amsterdam, Netherlands
[6] Karolinska Inst, Dept Clin Neurosci, Ctr Mol Med, Karolinska Univ Hosp, Stockholm, Sweden
[7] Fdn Res & Technol Hellas, Inst Comp Sci, Iraklion, Greece
[8] Thomas Jefferson Univ, Computat Med Ctr, Philadelphia, PA 19107 USA
[9] Univ Publ Navarra UPNA, Complejo Hosp Navarra CHN, Canc Signalling Unit, Navarrabiomed,Hlth Res Inst Navarre IdiSNA, Pamplona, Spain
[10] Univ Crete, Dept Comp Sci, Iraklion, Greece
[11] Univ Politecn Valencia, Dept Appl Stat Operat Res & Qual, Valencia, Spain
[12] Univ Florida, Inst Food & Agr Sci, Microbiol & Cell Sci, Gainesville, FL 32611 USA
[13] Univ Florida, Genet Inst, Gainesville, FL USA
[14] King Abdullah Univ Sci & Technol KAUST, Biol & Environm Sci & Engn Div, Thuwal, Saudi Arabia
[15] Karolinska Inst, Unit Computat Med, Dept Med, Ctr Mol Med,Karolinska Univ Hosp, Stockholm, Sweden
[16] Sci Life Lab, Solna, Sweden
[17] Kings Coll London, Mucosal & Salivary Biol Div, Inst Dent, London, England
Keywords
multi-omics analyses; data-integration; next-generation sequencing; component analysis; non-parametric combination; GeneSetCluster; COMPONENT ANALYSIS; PROXIMAL GENES; GLIOBLASTOMA; EXPRESSION; SCLEROSIS; MECHANISMS; LANDSCAPE; MUTATION; PATHWAY; CELLS
DOI
10.3389/fgene.2021.620453
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch, with an arbitrary decision over which tools to use and how to combine them. There is therefore an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, that combines available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two case studies from The Cancer Genome Atlas (TCGA). For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
Pages: 12