Processing genome-wide association studies within a repository of heterogeneous genomic datasets

Cited by: 1
Authors
Bernasconi, Anna [1]
Canakoglu, Arif [1]
Comolli, Federico [1]
Affiliations
[1] Politecn Milan, Dept Elect Informat & Bioengn, Via Ponzio 34-5, I-20133 Milan, Italy
Source
BMC GENOMIC DATA | 2023 / Volume 24 / Issue 01
Keywords
Data integration; Processed datasets; Tertiary data analysis; Genomics; Multiomics studies; GWAS; ANNOTATION; ONTOLOGY; LANGUAGE; SEQUENCE; SYSTEM; TOOL;
DOI
10.1186/s12863-023-01111-y
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Background: Genome-Wide Association Studies (GWAS) are based on the observation of genome-wide sets of genetic variants, typically single-nucleotide polymorphisms (SNPs), in different individuals that are associated with phenotypic traits. Research efforts have so far been directed to improving GWAS techniques rather than to making the results of GWAS interoperable with other genomic signals; this is currently hindered by the use of heterogeneous formats and uncoordinated experiment descriptions.
Results: To practically facilitate integrative use, we propose to include GWAS datasets within the META-BASE repository, exploiting an integration pipeline previously studied for other genomic datasets that includes several heterogeneous data types in the same format, queryable from the same systems. We represent GWAS SNPs and metadata by means of the Genomic Data Model and include metadata within a relational representation by extending the Genomic Conceptual Model with a dedicated view. To further reduce the gap with the descriptions of other signals in the repository of genomic datasets, we perform a semantic annotation of phenotypic traits. Our pipeline is demonstrated using two important data sources, initially organized according to different data models: the NHGRI-EBI GWAS Catalog and FinnGen (University of Helsinki). The integration effort finally allows us to use these datasets within multi-sample processing queries that respond to important biological questions. These are then made usable for multi-omic studies together with, e.g., somatic and reference mutation data, genomic annotations, and epigenetic signals.
Conclusions: As a result of our work on GWAS datasets, we enable 1) their interoperable use with several other homogenized and processed genomic datasets in the context of the META-BASE repository; 2) their big data processing by means of the GenoMetric Query Language and associated system. Future large-scale tertiary data analysis may extensively benefit from the addition of GWAS results to inform several different downstream analysis workflows.
Pages: 20