A quality control portal for sequencing data deposited at the European genome-phenome archive

Times cited: 1
Authors
Fernandez-Orth, Dietmar [1]
Rueda, Manuel [2]
Singh, Babita [2]
Moldes, Mauricio [3]
Jene, Aina [3]
Ferri, Marta [4]
Vasallo, Claudia [2]
Fromont, Lauren A. [2]
Navarro, Arcadi [5, 6, 7]
Rambla, Jordi [8]
Affiliations
[1] Ctr Genom Regulat CRG, European Genome Phenome Arch EGA, Barcelona, Spain
[2] EGA, Barcelona, Spain
[3] CRG, EGA, Barcelona, Spain
[4] Ctr Genom Regulat CRG, EGA, Barcelona, Spain
[5] Pompeu Fabra Univ UPF, Barcelona, Spain
[6] CRG, EGA Team, Barcelona, Spain
[7] Pasqual Maragall Fdn, Barcelona Beta Brain Res Ctr, Barcelona, Spain
[8] CRG, EGA Grp, Barcelona, Spain
Keywords
Fastq; quality control; variant call format (VCF); binary alignment map (BAM); European Genome-Phenome Archive (EGA); ALIGNMENT;
DOI
10.1093/bib/bbac136
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Since its launch in 2008, the European Genome-Phenome Archive (EGA) has been leading the archiving and distribution of human identifiable genomic data. In this regard, one of the community's concerns is the potential usability of the stored data: as of now, data submitters are not required to perform any quality control (QC) before uploading their data and associated metadata. Here, we present a new File QC Portal developed at EGA, along with QC reports generated for 1 694 442 files [Fastq, sequence alignment map (SAM)/binary alignment map (BAM)/CRAM and variant call format (VCF)] submitted to EGA. QC reports allow anonymous EGA users to view summary-level information about the files within a specific dataset, such as quality of reads, alignment quality, number and type of variants and other features. Researchers benefit from being able to assess the quality of data before the data access decision, thereby increasing the reusability of data (https://ega-archive.org/blog/data-upcycling-powered-by-ega/).
Pages: 5