Disease-specific prioritization of non-coding GWAS variants based on chromatin accessibility

Authors
Liang, Qianqian [1 ,2 ,3 ]
Abraham, Abin [4 ]
Capra, John A. [5 ,6 ]
Kostka, Dennis [1 ,2 ]
Affiliations
[1] Univ Pittsburgh, Sch Med, Dept Computat & Syst Biol, Pittsburgh, PA 15213 USA
[2] Univ Pittsburgh, Sch Med, Ctr Evolutionary Biol & Med, Pittsburgh, PA 15213 USA
[3] Univ Pittsburgh, Sch Publ Hlth, Dept Human Genet, Pittsburgh, PA USA
[4] Childrens Hosp Philadelphia, Philadelphia, PA USA
[5] Univ Calif San Francisco, Dept Epidemiol & Biostat, San Francisco, CA USA
[6] Univ Calif San Francisco, Bakar Computat Hlth Sci Inst, San Francisco, CA USA
Keywords
CELLS; ASSOCIATION; ANNOTATION; FRAMEWORK; INNATE;
DOI
10.1016/j.xhgg.2024.100310
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Non-protein-coding genetic variants are a major driver of the genetic risk for human disease; however, identifying which non-coding variants contribute to diseases, and through which mechanisms, remains challenging. In silico variant prioritization methods quantify a variant's severity, but for most methods, the specific phenotype and disease context of the prediction remain poorly defined. For example, many commonly used methods provide a single, organism-wide score for each variant, while other methods summarize a variant's impact in certain tissues and/or cell types. Here, we propose a complementary disease-specific variant prioritization scheme, which is motivated by the observation that variants contributing to disease often operate through specific biological mechanisms. We combine tissue/cell-type-specific variant scores (e.g., GenoSkyline, FitCons2, DNA accessibility) into disease-specific scores with a logistic regression approach and apply it to approximately 25,000 non-coding variants spanning 111 diseases. We show that this disease-specific aggregation significantly improves the association of common non-coding genetic variants with disease (average precision: 0.151, baseline = 0.09), compared with organism-wide scores (GenoCanyon, LINSIGHT, GWAVA, Eigen, CADD; average precision: 0.129, baseline = 0.09). Furthermore, disease similarities based on data-driven aggregation weights highlight meaningful disease groups and provide information about the tissues and cell types that drive these similarities. We also show that the similarities learned in this way are complementary to genetic similarities as quantified by genetic correlation. Overall, our approach demonstrates the strengths of disease-specific variant prioritization, leads to improvement in non-coding variant prioritization, and enables interpretable models that link variants to disease via specific tissues and/or cell types.
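The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three tissue names, the synthetic scores, the plain gradient-descent fit, and the helper names (`fit_logistic`, `average_precision`) are all assumptions made for illustration. It fits a logistic regression that combines per-tissue variant scores into one disease-specific score, then compares average precision against the baseline (the fraction of disease-associated variants).

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=500, l2=0.01):
    """L2-regularized logistic regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def average_precision(y_true, scores):
    """Mean of precision@k over the ranks k at which a positive is found."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Synthetic data (assumption): 50 disease-associated and 150 control variants,
# each with one score per hypothetical tissue. Only tissue_A is informative.
random.seed(0)
TISSUES = ["tissue_A", "tissue_B", "tissue_C"]
X, y = [], []
for i in range(200):
    label = 1 if i < 50 else 0
    informative = random.uniform(0.6, 1.0) if label else random.uniform(0.0, 0.4)
    X.append([informative, random.random(), random.random()])
    y.append(label)

weights, bias = fit_logistic(X, y)
scores = [sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) for xi in X]
ap = average_precision(y, scores)
baseline = sum(y) / len(y)  # expected average precision of a random ranking

for name, w in zip(TISSUES, weights):
    print(f"{name}: weight {w:+.2f}")
print(f"average precision {ap:.3f} vs baseline {baseline:.3f}")
```

In this toy setting the learned weights play the role of the aggregation weights: the informative tissue receives the largest weight, which is the kind of interpretability the abstract refers to. The sketch evaluates on its own training data for brevity; a real analysis would score held-out variants.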
Pages: 21