TVAR: assessing tissue-specific functional effects of non-coding variants with deep learning

Cited by: 2
Authors
Yang, Hai [1, 2]
Chen, Rui [2, 3]
Wang, Quan [2, 3]
Wei, Qiang [2, 3]
Ji, Ying [2, 3]
Zhong, Xue [3, 4]
Li, Bingshan [2, 3]
Affiliations
[1] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai 200237, Peoples R China
[2] Vanderbilt Univ, Dept Mol Physiol & Biophys, Nashville, TN 37232 USA
[3] Vanderbilt Univ, Vanderbilt Genet Inst, Nashville, TN 37232 USA
[4] Vanderbilt Univ, Med Ctr, Dept Med, Nashville, TN 37232 USA
Funding
National Institutes of Health (USA)
Keywords
INTEGRATIVE ANALYSIS; REGULATORY VARIANTS; PATHOGENICITY; FRAMEWORK; ELEMENTS; MODEL;
DOI
10.1093/bioinformatics/btac608
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Analysis of whole-genome sequencing (WGS) for genetics is still a challenge due to the lack of accurate functional annotation of non-coding variants, especially rare ones. As eQTLs have been extensively implicated in the genetics of human diseases, we hypothesize that rare non-coding variants discovered in WGS play a regulatory role in predisposing disease risk.
Results: Using thousands of tissue- and cell-type-specific epigenomic features, we propose TVAR, a multi-label learning-based deep neural network that predicts the functionality of non-coding variants in the genome based on eQTLs across 49 human tissues in the GTEx project. TVAR learns the relationships between high-dimensional epigenomics and eQTLs across tissues, taking the correlation among tissues into account to capture shared and tissue-specific eQTL effects. As a result, TVAR outputs tissue-specific annotations, with an average AUROC of 0.77 across these tissues. We evaluate TVAR's performance on four complex diseases (coronary artery disease, breast cancer, type 2 diabetes and schizophrenia) using its tissue-specific annotations, and observe superior performance in predicting functional variants, both common and rare, compared with five existing state-of-the-art tools. We further evaluate TVAR's G-score, a scoring scheme across all tissues, on ClinVar, fine-mapped GWAS loci and massively parallel reporter assay (MPRA)-validated variants, and observe consistently better performance than the competing tools.
Availability and implementation: The TVAR source code and its scores on the ClinVar catalog, fine-mapped GWAS loci, high-confidence eQTLs from the GTEx dataset, and MPRA-validated functional variants are available at https://github.com/haiyang1986/TVAR.
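The "average AUROC across tissues" quoted in the abstract can be read as a per-label AUROC averaged over the output labels of a multi-label model. The sketch below is only an illustration of that metric on toy data, not the TVAR evaluation code: the function names, the two-tissue toy matrices and the pairwise-comparison formulation of AUROC are all assumptions introduced here.

```python
from typing import List, Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a
    random negative, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auroc_across_labels(
    label_matrix: List[List[int]], score_matrix: List[List[float]]
) -> float:
    """Average the column-wise AUROC; rows are variants, columns are labels
    (here, hypothetical tissues)."""
    n_labels = len(label_matrix[0])
    per_label = []
    for j in range(n_labels):
        y = [row[j] for row in label_matrix]
        s = [row[j] for row in score_matrix]
        per_label.append(auroc(y, s))
    return sum(per_label) / n_labels


# Toy data (assumed for illustration): 4 variants x 2 tissues.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.3, 0.4], [0.6, 0.7], [0.4, 0.5]]
toy_mean = mean_auroc_across_labels(toy_labels, toy_scores)
```

On the toy data the first tissue is perfectly ranked and the second has one misordered pair out of four, so the average sits between the two per-tissue values. A real evaluation would typically use a library routine on held-out variants rather than this quadratic-time loop.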
Pages: 4697-4704
Number of pages: 8