Incorporating ENCODE information into association analysis of whole genome sequencing data

Authors
Kim T. [1]
Wei P. [1]
Affiliation
[1] Department of Biostatistics, University of Texas MD Anderson Cancer Center, Houston, TX 77030
Funding
U.S. National Institutes of Health
Keywords
Whole Genome Sequencing; Whole Exome Sequencing; Nonsynonymous Variant; Whole Genome Sequencing Data; Sequence Kernel Association Test
DOI
10.1186/s12919-016-0040-y
Abstract
With the rapidly decreasing cost of next-generation sequencing technology, a large number of whole genome sequences have been generated, enabling researchers to survey rare variants in both the protein-coding and regulatory regions of the genome. However, identifying functional variants associated with complex diseases from whole genome sequencing (WGS) data remains a daunting task, because there are millions of candidate variants yet only moderate sample sizes. We propose incorporating Encyclopedia of DNA Elements (ENCODE) information into the association analysis of WGS data to boost statistical power. We use RegulomeDB and PolyPhen2 scores as external weights in existing rare-variant association tests. We demonstrate the proposed framework using the WGS data and blood pressure phenotype from the San Antonio Family Studies provided by the Genetic Analysis Workshop 19. We identified a genome-wide significant locus in the gene SNUPN on chromosome 15 that harbors a rare nonsynonymous variant. This locus was not detected by benchmark methods that do not incorporate biological information, including the T5 burden test and the sequence kernel association test. © 2016 The Author(s).
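The idea of using functional scores as external weights can be illustrated with a minimal weighted burden sketch. This is not the authors' implementation: the function names, the toy genotypes, and the numeric weights are assumptions made for illustration, and the sketch treats subjects as unrelated, whereas family data such as the San Antonio Family Studies would require a model that accounts for relatedness.

```python
import numpy as np


def weighted_burden(genotypes, weights):
    """Collapse variants into one score per subject: sum_j w_j * G_ij.

    genotypes: (n_subjects, n_variants) minor-allele counts (0, 1, or 2).
    weights:   (n_variants,) external weights. In the abstract's setting
               these would be derived from RegulomeDB or PolyPhen2 scores;
               the mapping from score to weight is NOT specified here.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return genotypes @ weights


def burden_score_statistic(phenotype, burden):
    """Score statistic for H0: b = 0 in y = a + b * burden + e.

    Approximately chi-square with 1 df under H0 for unrelated subjects.
    Algebraically equal to n * r^2, where r is the Pearson correlation.
    Assumes both phenotype and burden vary across subjects.
    """
    y = np.asarray(phenotype, dtype=float)
    b = np.asarray(burden, dtype=float)
    y_c = y - y.mean()
    b_c = b - b.mean()
    u = b_c @ y_c                               # score for the burden effect
    var_u = (y_c @ y_c / len(y)) * (b_c @ b_c)  # null variance of the score
    return u ** 2 / var_u


# Toy data: 6 subjects, 3 rare variants in one gene (hypothetical values).
G = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1],
              [0, 0, 0],
              [1, 0, 1],
              [0, 0, 0]])
w = np.array([1.0, 0.5, 0.1])   # hypothetical functional weights
y = np.array([135.0, 128.0, 121.0, 118.0, 140.0, 119.0])  # toy phenotype

burden = weighted_burden(G, w)
stat = burden_score_statistic(y, burden)
print(burden, stat)
```

Setting all weights to 1 recovers an unweighted burden score, so the only change introduced by the external annotation is how much each variant contributes to the collapsed score.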