Disease-specific prioritization of non-coding GWAS variants based on chromatin accessibility

Authors
Liang, Qianqian [1, 2, 3]
Abraham, Abin [4]
Capra, John A. [5, 6]
Kostka, Dennis [1, 2]
Affiliations
[1] Univ Pittsburgh, Sch Med, Dept Computat & Syst Biol, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Sch Med, Ctr Evolutionary Biol & Med, Pittsburgh, PA 15213 USA
[3] Univ Pittsburgh, Sch Publ Hlth, Dept Human Genet, Pittsburgh, PA USA
[4] Childrens Hosp Philadelphia, Philadelphia, PA USA
[5] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA USA
[6] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA USA
Source
Keywords
CELLS; ASSOCIATION; ANNOTATION; FRAMEWORK; INNATE;
DOI
10.1016/j.xhgg.2024.100310
Chinese Library Classification (CLC) number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Non-protein-coding genetic variants are a major driver of the genetic risk for human disease; however, identifying which non-coding variants contribute to diseases, and through which mechanisms, remains challenging. In silico variant prioritization methods quantify a variant's severity, but for most methods the specific phenotype and disease context of the prediction remain poorly defined. For example, many commonly used methods provide a single, organism-wide score for each variant, while other methods summarize a variant's impact in certain tissues and/or cell types. Here, we propose a complementary disease-specific variant prioritization scheme, motivated by the observation that variants contributing to disease often operate through specific biological mechanisms. We combine tissue/cell-type-specific variant scores (e.g., GenoSkyline, FitCons2, DNA accessibility) into disease-specific scores with a logistic regression approach and apply it to ~25,000 non-coding variants spanning 111 diseases. We show that this disease-specific aggregation significantly improves the association of common non-coding genetic variants with disease (average precision: 0.151, baseline = 0.09) compared with organism-wide scores (GenoCanyon, LINSIGHT, GWAVA, Eigen, CADD; average precision: 0.129, baseline = 0.09). Furthermore, disease similarities based on data-driven aggregation weights highlight meaningful disease groups and provide information about the tissues and cell types that drive these similarities. We also show that the similarities learned in this way are complementary to genetic similarities as quantified by genetic correlation. Overall, our approach demonstrates the strengths of disease-specific variant prioritization, improves non-coding variant prioritization, and enables interpretable models that link variants to disease via specific tissues and/or cell types.
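The aggregation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the names (`make_variant`, `N_TISSUES`, `RELEVANT`) are hypothetical, the logistic regression is a plain gradient-descent fit written in pure Python, and the unweighted mean across tissues is only a crude stand-in for an organism-wide score (it is not how GenoCanyon, LINSIGHT, GWAVA, Eigen, or CADD are computed). The sketch fits per-tissue weights for one simulated disease and compares average precision against the unweighted mean.

```python
import math
import random

random.seed(0)  # fixed seed so the synthetic example is reproducible

N_TISSUES = 4   # hypothetical number of tissue-specific scores per variant
RELEVANT = 0    # hypothetical tissue through which the simulated disease acts

def make_variant(is_disease):
    """Synthetic tissue scores in [0, 1]; only the relevant tissue carries signal."""
    scores = [random.random() for _ in range(N_TISSUES)]
    if is_disease:
        scores[RELEVANT] = random.uniform(0.5, 1.0)
    else:
        scores[RELEVANT] = random.uniform(0.0, 0.6)
    return scores

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.01):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        w = [wj - lr * (grad_w[j] / n + l2 * wj) for j, wj in enumerate(w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Disease-specific score: predicted probability of disease association."""
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

def average_precision(y_true, scores):
    """Mean of the precision values at the ranks where positives occur."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Simulate 400 variants, roughly 25% of them disease-associated.
labels = [1 if random.random() < 0.25 else 0 for _ in range(400)]
X = [make_variant(y) for y in labels]
X_train, y_train = X[:300], labels[:300]
X_test, y_test = X[300:], labels[300:]

# Disease-specific aggregation: learn one weight per tissue score.
w, b = fit_logistic(X_train, y_train)
ap_specific = average_precision(y_test, predict(X_test, w, b))

# Stand-in for an organism-wide score: unweighted mean across tissues.
ap_baseline = average_precision(y_test, [sum(x) / len(x) for x in X_test])

print("learned tissue weights:", [round(wj, 2) for wj in w])
print("AP, disease-specific aggregation:", round(ap_specific, 3))
print("AP, unweighted mean across tissues:", round(ap_baseline, 3))
```

In this toy setting the learned weights are what make the model interpretable: the tissue with the largest weight is the one the simulated disease acts through, which mirrors how the abstract uses aggregation weights to relate diseases to tissues and cell types.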
Pages: 21
Related papers
  • [1] RegVar: Tissue-specific Prioritization of Non-coding Regulatory Variants
    Lu, Hao
    Ma, Luyu
    Quan, Cheng
    Li, Lei
    Lu, Yiming
    Zhou, Gangqiao
    Zhang, Chenggang
    GENOMICS PROTEOMICS & BIOINFORMATICS, 2023, 21 (02) : 385 - 395
  • [2] Prioritization of non-coding disease-causing variants and long non-coding RNAs in liver cancer
    Li, Hua
    He, Zekun
    Gu, Yang
    Fang, Lin
    Lv, Xin
    ONCOLOGY LETTERS, 2016, 12 (05) : 3987 - 3994
  • [3] Context-specific prioritization of non-coding variants implicated in human diseases
    Moyon, L.
    Berthelot, C.
    Crollius, H. Roest
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2019, 27 : 596 - 596
  • [4] Systematic analysis of the effects of genetic variants on chromatin accessibility to decipher functional variants in non-coding regions
    Wang, Dongyang
    Wu, Xiaohong
    Jiang, Guanghui
    Yang, Jianye
    Yu, Zhanhui
    Yang, Yanbo
    Yang, Wenqian
    Niu, Xiaohui
    Tang, Ke
    Gong, Jing
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [5] DIVAN: accurate identification of non-coding disease-specific risk variants using multi-omics profiles
    Chen, Li
    Jin, Peng
    Qin, Zhaohui S.
    GENOME BIOLOGY, 2016, 17
  • [6] DIVAN: accurate identification of non-coding disease-specific risk variants using multi-omics profiles
    Li Chen
    Peng Jin
    Zhaohui S. Qin
    Genome Biology, 17
  • [7] Prioritization of regulatory variants with tissue-specific function in the non-coding regions of human genome
    Dong, Shengcheng
    Boyle, Alan P.
    NUCLEIC ACIDS RESEARCH, 2022, 50 (01)
  • [8] Demystifying non-coding GWAS variants: an overview of computational tools and methods
    Schipper, Marijn
    Posthuma, Danielle
    HUMAN MOLECULAR GENETICS, 2022, 31 (R1) : R73 - R83
  • [9] Principles and methods of in-silico prioritization of non-coding regulatory variants
    Lee, Phil H.
    Lee, Christian
    Li, Xihao
    Wee, Brian
    Dwivedi, Tushar
    Daly, Mark
    HUMAN GENETICS, 2018, 137 (01) : 15 - 30
  • [10] Principles and methods of in-silico prioritization of non-coding regulatory variants
    Phil H. Lee
    Christian Lee
    Xihao Li
    Brian Wee
    Tushar Dwivedi
    Mark Daly
    Human Genetics, 2018, 137 : 15 - 30