Testing for association with rare variants in the coding and non-coding genome: RAVA-FIRST, a new approach based on CADD deleteriousness score

Cited by: 5
Authors
Bocher, Ozvan [1 ,2 ]
Ludwig, Thomas E. [1 ,3 ]
Ogloblinsky, Marie-Sophie [1 ]
Marenne, Gaelle [1 ]
Deleuze, Jean-Francois [4 ]
Suryakant, Suryakant [5 ]
Odeberg, Jacob [6 ,7 ]
Morange, Pierre-Emmanuel [8 ]
Tregouet, David-Alexandre [5 ]
Perdry, Herve [9 ]
Genin, Emmanuelle [1 ,3 ]
Affiliations
[1] Univ Brest, INSERM, EFS, UMR 1078,GGB, Brest, France
[2] Helmholtz Zentrum Munchen, Inst Translat Genom, Munich, Germany
[3] CHU Brest, Brest, France
[4] Univ Paris Saclay, Ctr Natl Rech Genom Humaine CNRGH, Inst Biol Francois Jacob, CEA, Evry, France
[5] Univ Bordeaux, INSERM, Bordeaux Populat Hlth Res Ctr, Team ELEANOR,UMR 1219, Bordeaux, France
[6] KTH Royal Inst Technol, Sci Life Lab, Dept Prot Sci, CBH, Stockholm, Sweden
[7] Arctic Univ Tromso, Dept Clin Med, Fac Hlth Sci, Tromso, Norway
[8] Aix Marseille Univ, INSERM, INRAE, C2VN, Marseille, France
[9] Univ Paris Saclay, Univ Paris Sud, UFR Med, CESP Inserm,U1018, Villejuif, France
Source
PLOS GENETICS | 2022 / Vol. 18 / Issue 09
Keywords
EXPRESSION; ADHESION; REGIONS; CD226;
DOI
10.1371/journal.pgen.1009923
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Rare variant association tests (RVAT) have been developed to study the contribution of rare variants, which are now widely accessible through high-throughput sequencing technologies. RVAT require aggregating rare variants into testing units and filtering variants to retain only those most likely to be causal. In the exome, genes are natural testing units and variants are usually filtered based on their functional consequences. However, with whole-genome sequence (WGS) data, both steps are challenging. No natural biological unit is available for aggregating rare variants. Sliding-window procedures have been proposed to circumvent this difficulty, but they are blind to biological information and result in a large number of tests. We propose a new strategy to perform RVAT on WGS data, "RAVA-FIRST" (RAre Variant Association using Functionally-InfoRmed STeps), comprising three steps. (1) New testing units, referred to as "CADD regions", are defined genome-wide based on functionally-adjusted Combined Annotation Dependent Depletion (CADD) scores of variants observed in the gnomAD populations. (2) A region-dependent filtering of rare variants is applied in each CADD region. (3) A functionally-informed burden test is performed with sub-scores computed for each genomic category within each CADD region. In both simulations and real data, RAVA-FIRST was found to outperform other WGS-based RVAT. Applied to a WGS dataset of venous thromboembolism patients, RAVA-FIRST identified an intergenic region on chromosome 18 enriched for rare variants in early-onset patients. This region, which was missed by standard sliding-window procedures, lies within a topologically associating domain (TAD) that contains a strong candidate gene. RAVA-FIRST enables new investigations of rare non-coding variants in complex diseases, facilitated by its implementation in the R package Ravages.
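Steps (2) and (3) of the abstract can be illustrated with a minimal Python sketch. It is not the Ravages API and not the authors' implementation. The function names, the data layout, the 1% frequency cutoff, the use of the median reference score as the region-specific threshold, and the unweighted allele counts are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def region_threshold(reference_scores):
    """Hypothetical region-specific cutoff: the median deleteriousness
    score of reference-panel variants seen in the region (an assumption)."""
    return median(reference_scores)


def burden_subscores(genotypes, variants, threshold, max_maf=0.01):
    """Compute one burden sub-score per genomic category and individual.

    genotypes: {individual: {variant_id: minor-allele count (0, 1 or 2)}}
    variants:  {variant_id: {"maf": float, "score": float, "category": str}}
    """
    # Step 2 (filtering): keep rare variants at or above the region threshold.
    kept = {
        vid: info
        for vid, info in variants.items()
        if info["maf"] <= max_maf and info["score"] >= threshold
    }
    # Step 3 (sub-scores): sum retained minor alleles within each category.
    result = {}
    for individual, calls in genotypes.items():
        sub = defaultdict(int)
        for vid, info in kept.items():
            sub[info["category"]] += calls.get(vid, 0)
        result[individual] = dict(sub)
    return result


# Toy data for a single region; all values are invented for illustration.
variants = {
    "v1": {"maf": 0.001, "score": 12.0, "category": "coding"},
    "v2": {"maf": 0.002, "score": 3.0, "category": "regulatory"},
    "v3": {"maf": 0.004, "score": 9.0, "category": "regulatory"},
    "v4": {"maf": 0.20, "score": 15.0, "category": "intergenic"},
}
genotypes = {
    "case1": {"v1": 1, "v2": 1, "v3": 2},
    "ctrl1": {"v4": 1},
}
threshold = region_threshold([2.0, 5.0, 8.0, 11.0, 14.0])
subscores = burden_subscores(genotypes, variants, threshold)
```

In an actual association analysis, the per-category sub-scores would then enter a regression of disease status on these scores. That step is omitted here, as is any variant weighting.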
Pages: 19