STATegra: Multi-Omics Data Integration - A Conceptual Scheme With a Bioinformatics Pipeline

Cited by: 18
Authors
Planell, Nuria [1]
Lagani, Vincenzo [2, 3]
Sebastian-Leon, Patricia [4]
van der Kloet, Frans [5]
Ewing, Ewoud [6]
Karathanasis, Nestoras [7, 8]
Urdangarin, Arantxa [1]
Arozarena, Imanol [9]
Jagodic, Maja [6]
Tsamardinos, Ioannis [3, 10]
Tarazona, Sonia [11]
Conesa, Ana [12, 13]
Tegner, Jesper [14, 15, 16]
Gomez-Cabrero, David [1, 14, 15, 17]
Affiliations
[1] Univ Publ Navarra UPNA, Translat Bioinformat Unit, Navarrabiomed, Complejo Hosp Navarra CHN,IdiSNA, Pamplona, Spain
[2] Ilia State Univ, Inst Chem Biol, Tbilisi, Georgia
[3] Gnosis Data Anal PC, Iraklion, Greece
[4] IVI RMA Inst Valenciano Infertilidad Reprod Med A, Dept Genom & Syst Reprod Med, Valencia, Spain
[5] Univ Amsterdam, Swammerdam Inst Life Sci, Amsterdam, Netherlands
[6] Karolinska Inst, Dept Clin Neurosci, Ctr Mol Med, Karolinska Univ Hosp, Stockholm, Sweden
[7] Fdn Res & Technol Hellas, Inst Comp Sci, Iraklion, Greece
[8] Thomas Jefferson Univ, Computat Med Ctr, Philadelphia, PA 19107 USA
[9] Univ Publ Navarra UPNA, Complejo Hosp Navarra CHN, Canc Signalling Unit, Navarrabiomed,Hlth Res Inst Navarre IdiSNA, Pamplona, Spain
[10] Univ Crete, Dept Comp Sci, Iraklion, Greece
[11] Univ Politecn Valencia, Dept Appl Stat Operat Res & Qual, Valencia, Spain
[12] Univ Florida, Inst Food & Agr Sci, Microbiol & Cell Sci, Gainesville, FL 32611 USA
[13] Univ Florida, Genet Inst, Gainesville, FL USA
[14] King Abdullah Univ Sci & Technol KAUST, Biol & Environm Sci & Engn Div, Thuwal, Saudi Arabia
[15] Karolinska Inst, Unit Computat Med, Dept Med, Ctr Mol Med,Karolinska Univ Hosp, Stockholm, Sweden
[16] Sci Life Lab, Solna, Sweden
[17] Kings Coll London, Mucosal & Salivary Biol Div, Inst Dent, London, England
Keywords
multi-omics analyses; data-integration; next-generation sequencing; component analysis; non-parametric combination; GeneSetCluster; COMPONENT ANALYSIS; PROXIMAL GENES; GLIOBLASTOMA; EXPRESSION; SCLEROSIS; MECHANISMS; LANDSCAPE; MUTATION; PATHWAY; CELLS
DOI
10.3389/fgene.2021.620453
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Technologies for profiling samples using different omics platforms have been at the forefront since the Human Genome Project. Large-scale multi-omics data hold the promise of deciphering different regulatory layers. Yet, while there is a myriad of bioinformatics tools, each multi-omics analysis appears to start from scratch, with an arbitrary decision over which tools to use and how to combine them. There is therefore an unmet need to conceptualize how to integrate such data and to implement and validate pipelines in different cases. We have designed a conceptual framework (STATegra), intended to be as generic as possible for multi-omics analysis, that combines available multi-omics analysis tools (machine learning component analysis, non-parametric data combination, and a multi-omics exploratory analysis) in a step-wise manner. While we have previously combined those integrative tools in several studies, here we provide a systematic description of the STATegra framework and its validation using two case studies from The Cancer Genome Atlas (TCGA). For both the Glioblastoma and the Skin Cutaneous Melanoma (SKCM) cases, we demonstrate an enhanced capacity of the framework (beyond that of the individual tools) to identify features and pathways compared to single-omics analysis. Such an integrative multi-omics analysis framework for identifying features and components facilitates the discovery of new biology. Finally, we provide several options for applying the STATegra framework when parametric assumptions are fulfilled and for the case when not all the samples are profiled for all omics. The STATegra framework is built using several tools, which are being integrated step-by-step as open source in the STATegRa Bioconductor package.
Pages: 12