Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data

Cited by: 207
Authors
Huang, Yi-Fei [1]
Gulko, Brad [1,2]
Siepel, Adam [1]
Affiliations
[1] Simons Ctr Quantitat Biol, Cold Spring Harbor Lab, Cold Spring Harbor, NY 11724 USA
[2] Cornell Univ, Grad Field Comp Sci, Ithaca, NY USA
Funding
US National Institutes of Health (NIH);
Keywords
UNIFIED ARCHITECTURE; CONSERVED ELEMENTS; NATURAL-SELECTION; SEQUENCE; SITES; DNA; PATHOGENICITY; ANNOTATIONS; PROMOTERS; EVOLUTION;
DOI
10.1038/ng.3810
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Many genetic variants that influence phenotypes of interest are located outside of protein-coding genes, yet existing methods for identifying such variants have poor predictive power. Here we introduce a new computational method, called LINSIGHT, that substantially improves the prediction of noncoding nucleotide sites at which mutations are likely to have deleterious fitness consequences, and which, therefore, are likely to be phenotypically important. LINSIGHT combines a generalized linear model for functional genomic data with a probabilistic model of molecular evolution. The method is fast and highly scalable, enabling it to exploit the 'big data' available in modern genomics. We show that LINSIGHT outperforms the best available methods in identifying human noncoding variants associated with inherited diseases. In addition, we apply LINSIGHT to an atlas of human enhancers and show that the fitness consequences at enhancers depend on cell type, tissue specificity, and constraints at associated promoters.
Pages: 618 / +
Number of pages: 9
Related papers
50 in total
  • [21] Genomic prediction based on selected variants from imputed whole-genome sequence data in Australian sheep populations
    Nasir Moghaddar
    Majid Khansefid
    Julius H. J. van der Werf
    Sunduimijid Bolormaa
    Naomi Duijvesteijn
    Samuel A. Clark
    Andrew A. Swan
    Hans D. Daetwyler
    Iona M. MacLeod
    Genetics Selection Evolution, 51
  • [23] Data Access and Use: From Rosbreed Data Management to Genomic Prediction
    Main, Dorrie
    Jung, Sook
    Peace, Cameron
    Bassil, Nahla
    Hardner, Craig M.
    McFerson, James R.
    Iezzoni, Amy F.
    Lee, Taein
    Cheng, Chun-Huai
    Hough, Heidi
    Luby, James J.
    Gasic, Ksenija
    Humann, Jodi L.
    Edge-Garza, Daniel
    Zurn, Jason
    DeVetter, Lisa Wasko
    Evans, Kate
    Sebolt, Audrey
    Vanderzande, Stijn
    Coe, Michael
    HORTSCIENCE, 2019, 54 (09) : S4 - S5
  • [24] An unsupervised deep learning framework for predicting human essential genes from population and functional genomic data
    Lapolice, Troy M.
    Huang, Yi-Fei
    BMC BIOINFORMATICS, 2023, 24 (01)
  • [25] Analysis of Population Genomic Data from Hybrid Zones
    Gompert, Zachariah
    Mandeville, Elizabeth G.
    Buerkle, C. Alex
    ANNUAL REVIEW OF ECOLOGY, EVOLUTION, AND SYSTEMATICS, VOL 48, 2017, 48 : 207 - 229
  • [26] SparkSeq: fast, scalable and cloud-ready tool for the interactive genomic data analysis with nucleotide precision
    Wiewiorka, Marek S.
    Messina, Antonio
    Pacholewska, Alicja
    Maffioletti, Sergio
    Gawrysiak, Piotr
    Okoniewski, Michal J.
    BIOINFORMATICS, 2014, 30 (18) : 2652 - 2653
  • [27] Reliable Identification of Genomic Variants from RNA-Seq Data
    Piskol, Robert
    Ramaswami, Gokul
    Li, Jin Billy
    AMERICAN JOURNAL OF HUMAN GENETICS, 2013, 93 (04) : 641 - 651
  • [28] A framework for automated scalable designation of viral pathogen lineages from genomic data
    McBroome, Jakob
    Schneider, Adriano de Bernardi
    Roemer, Cornelius
    Wolfinger, Michael T.
    Hinrichs, Angie S.
    O'Toole, Aine Niamh
    Ruis, Christopher
    Turakhia, Yatish
    Rambaut, Andrew
    Corbett-Detig, Russell
    NATURE MICROBIOLOGY, 2024, 9 (2) : 550 - 560
  • [30] Exploiting deep transfer learning for the prediction of functional non-coding variants using genomic sequence
    Chen, Li
    Wang, Ye
    Zhao, Fengdi
    BIOINFORMATICS, 2022, 38 (12) : 3164 - 3172