Analysis of protein-coding genetic variation in 60,706 humans

Cited by: 0
Authors
Monkol Lek
Konrad J. Karczewski
Eric V. Minikel
Kaitlin E. Samocha
Eric Banks
Timothy Fennell
Anne H. O’Donnell-Luria
James S. Ware
Andrew J. Hill
Beryl B. Cummings
Taru Tukiainen
Daniel P. Birnbaum
Jack A. Kosmicki
Laramie E. Duncan
Karol Estrada
Fengmei Zhao
James Zou
Emma Pierce-Hoffman
Joanne Berghout
David N. Cooper
Nicole Deflaux
Mark DePristo
Ron Do
Jason Flannick
Menachem Fromer
Laura Gauthier
Jackie Goldstein
Namrata Gupta
Daniel Howrigan
Adam Kiezun
Mitja I. Kurki
Ami Levy Moonshine
Pradeep Natarajan
Lorena Orozco
Gina M. Peloso
Ryan Poplin
Manuel A. Rivas
Valentin Ruano-Rubio
Samuel A. Rose
Douglas M. Ruderfer
Khalid Shakir
Peter D. Stenson
Christine Stevens
Brett P. Thomas
Grace Tiao
Maria T. Tusie-Luna
Ben Weisburd
Hong-Hee Won
Dongmei Yu
David M. Altshuler
Affiliations
[1] Analytic and Translational Genetics Unit, Division of Genetics and Genomics
[2] Massachusetts General Hospital, Department of Genetics
[3] Program in Medical and Population Genetics, Department of Genetics and Genomic Sciences
[4] Broad Institute of MIT and Harvard, Department of Molecular Biology
[5] School of Paediatrics and Child Health, Department of Psychiatry
[6] University of Sydney, Department of Neurology
[7] Institute for Neuroscience and Muscle Research, Department of Cardiology
[8] Children’s Hospital at Westmead, Department of Biostatistics and Center for Statistical Genetics
[9] Program in Biological and Biomedical Sciences, Department of Public Health and Primary Care
[10] Harvard Medical School, Department of Pathology and Cancer Center
[11] Stanley Center for Psychiatric Research, Department of Psychiatry and Behavioral Sciences
[12] Broad Institute of MIT and Harvard, Department of Neuroscience and Physiology
[13] Boston Children’s Hospital, Department of Medical Epidemiology and Biostatistics
[14] Harvard Medical School, Department of Medicine
[15] National Heart and Lung Institute, Department of Biostatistics and Epidemiology
[16] Imperial College London, Department of Medicine
[17] NIHR Royal Brompton Cardiovascular Biomedical Research Unit, Department of Neuroscience
[18] Royal Brompton Hospital, Department of Genetics
[19] MRC Clinical Sciences Centre, Department of Medical Epidemiology and Biostatistics
[20] Imperial College London, Department of Public Health
[21] Genome Sciences, Department of Psychiatry
[22] University of Washington, Radcliffe Department of Medicine
[23] Program in Bioinformatics and Integrative Genomics, Department of Physiology and Biophysics
[24] Harvard Medical School
[25] Mouse Genome Informatics
[26] Jackson Laboratory
[27] Center for Biomedical Informatics and Biostatistics
[28] University of Arizona
[29] Institute of Medical Genetics
[30] Cardiff University
[31] Google
[32] Mountain View
[33] Broad Institute of MIT and Harvard
[34] Icahn School of Medicine at Mount Sinai
[35] Institute for Genomics and Multiscale Biology
[36] Icahn School of Medicine at Mount Sinai
[37] The Charles Bronfman Institute for Personalized Medicine
[38] Icahn School of Medicine at Mount Sinai
[39] The Center for Statistical Genetics
[40] Icahn School of Medicine at Mount Sinai
[41] Massachusetts General Hospital
[42] Icahn School of Medicine at Mount Sinai
[43] Psychiatric and Neurodevelopmental Genetics Unit
[44] Massachusetts General Hospital
[45] Harvard Medical School
[46] Center for Human Genetic Research
[47] Massachusetts General Hospital
[48] Cardiovascular Research Center
[49] Massachusetts General Hospital
[50] Immunogenomics and Metabolic Disease Laboratory
Source
Nature | 2016 / Vol. 536
Keywords
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.
Pages: 285-291
Page count: 6
Related papers
50 in total
  • [41] Tandem repeat copy-number variation in protein-coding regions of human genes
    Colm T O'Dushlaine
    Richard J Edwards
    Stephen D Park
    Denis C Shields
    Genome Biology, 6
  • [42] Poly(T) variation within mitochondrial protein-coding genes in Globodera (Nematoda: Heteroderidae)
    Riepsamen, Angelique H.
    Blok, Vivian C.
    Phillips, Mark
    Gibson, Tracey
    Dowton, Mark
    JOURNAL OF MOLECULAR EVOLUTION, 2008, 66 (03) : 197 - 209
  • [43] Tandem repeat copy-number variation in protein-coding regions of human genes
    O'Dushlaine, CT
    Edwards, RJ
    Park, SD
    Shields, DC
    GENOME BIOLOGY, 2005, 6 (08)
  • [44] Poly(T) Variation Within Mitochondrial Protein-Coding Genes in Globodera (Nematoda: Heteroderidae)
    Angelique H. Riepsamen
    Vivian C. Blok
    Mark Phillips
    Tracey Gibson
    Mark Dowton
    Journal of Molecular Evolution, 2008, 66 : 197 - 209
  • [45] Identification and Analysis of SSRs Derived from Protein-coding Genes in Grape
    Pengfei WANG
    Ling SU
    Xilong JIANG
    Yingchun CHEN
    Fengshan REN
    Yongmei WANG
    Agricultural Science & Technology, 2017, 18 (09) : 1579 - 1584
  • [46] The genetic code is nearly optimal for allowing additional information within protein-coding sequences
    Itzkovitz, Shalev
    Alon, Uri
    GENOME RESEARCH, 2007, 17 (04) : 405 - 412
  • [47] LncRNAs and Protein-coding Genes Expression Analysis for Myelodysplastic Syndromes Diagnoses
    Al-Bakhat, Lama
    Al-Serhani, Norah
    2020 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE & MODERN ASSISTIVE TECHNOLOGY (ICAIMAT), 2020,
  • [48] CpG plus CpNpG analysis of protein-coding sequences from tomato
    Hobolth, Asger
    Nielsen, Rasmus
    Wang, Ying
    Wu, Feinan
    Tanksley, Steven D.
    MOLECULAR BIOLOGY AND EVOLUTION, 2006, 23 (06) : 1318 - 1323
  • [50] Analysis of nucleosome positioning in promoters of miRNA genes and protein-coding genes
    Liu HongDe
    Zhang DeJin
    Xie JianMing
    Yuan ZhiDong
    Ma Xin
    Lu ZhiYuan
    Gong LeJun
    Sun Xiao
    CHINESE SCIENCE BULLETIN, 2010, 55 (22) : 2347 - 2352