Analysis of protein-coding genetic variation in 60,706 humans

Authors
Monkol Lek
Konrad J. Karczewski
Eric V. Minikel
Kaitlin E. Samocha
Eric Banks
Timothy Fennell
Anne H. O’Donnell-Luria
James S. Ware
Andrew J. Hill
Beryl B. Cummings
Taru Tukiainen
Daniel P. Birnbaum
Jack A. Kosmicki
Laramie E. Duncan
Karol Estrada
Fengmei Zhao
James Zou
Emma Pierce-Hoffman
Joanne Berghout
David N. Cooper
Nicole Deflaux
Mark DePristo
Ron Do
Jason Flannick
Menachem Fromer
Laura Gauthier
Jackie Goldstein
Namrata Gupta
Daniel Howrigan
Adam Kiezun
Mitja I. Kurki
Ami Levy Moonshine
Pradeep Natarajan
Lorena Orozco
Gina M. Peloso
Ryan Poplin
Manuel A. Rivas
Valentin Ruano-Rubio
Samuel A. Rose
Douglas M. Ruderfer
Khalid Shakir
Peter D. Stenson
Christine Stevens
Brett P. Thomas
Grace Tiao
Maria T. Tusie-Luna
Ben Weisburd
Hong-Hee Won
Dongmei Yu
David M. Altshuler
Affiliations
[1] Analytic and Translational Genetics Unit,Division of Genetics and Genomics
[2] Massachusetts General Hospital,Department of Genetics
[3] Program in Medical and Population Genetics,Department of Genetics and Genomic Sciences
[4] Broad Institute of MIT and Harvard,Department of Molecular Biology
[5] School of Paediatrics and Child Health,Department of Psychiatry
[6] University of Sydney,Department of Neurology
[7] Institute for Neuroscience and Muscle Research,Department of Cardiology
[8] Children’s Hospital at Westmead,Department of Biostatistics and Center for Statistical Genetics
[9] Program in Biological and Biomedical Sciences,Department of Public Health and Primary Care
[10] Harvard Medical School,Department of Pathology and Cancer Center
[11] Stanley Center for Psychiatric Research,Department of Psychiatry and Behavioral Sciences
[12] Broad Institute of MIT and Harvard,Department of Neuroscience and Physiology
[13] Boston Children’s Hospital,Department of Medical Epidemiology and Biostatistics
[14] Harvard Medical School,Department of Medicine
[15] National Heart and Lung Institute,Department of Biostatistics and Epidemiology
[16] Imperial College London,Department of Medicine
[17] NIHR Royal Brompton Cardiovascular Biomedical Research Unit,Department of Neuroscience
[18] Royal Brompton Hospital,Department of Genetics
[19] MRC Clinical Sciences Centre,Department of Medical Epidemiology and Biostatistics
[20] Imperial College London,Department of Public Health
[21] Genome Sciences,Department of Psychiatry
[22] University of Washington,Radcliffe Department of Medicine
[23] Program in Bioinformatics and Integrative Genomics,Department of Physiology and Biophysics
[24] Harvard Medical School
[25] Mouse Genome Informatics
[26] Jackson Laboratory
[27] Center for Biomedical Informatics and Biostatistics
[28] University of Arizona
[29] Institute of Medical Genetics
[30] Cardiff University
[31] Google
[32] Mountain View
[33] Broad Institute of MIT and Harvard
[34] Icahn School of Medicine at Mount Sinai
[35] Institute for Genomics and Multiscale Biology
[36] Icahn School of Medicine at Mount Sinai
[37] The Charles Bronfman Institute for Personalized Medicine
[38] Icahn School of Medicine at Mount Sinai
[39] The Center for Statistical Genetics
[40] Icahn School of Medicine at Mount Sinai
[41] Massachusetts General Hospital
[42] Icahn School of Medicine at Mount Sinai
[43] Psychiatric and Neurodevelopmental Genetics Unit
[44] Massachusetts General Hospital
[45] Harvard Medical School
[46] Center for Human Genetic Research
[47] Massachusetts General Hospital
[48] Cardiovascular Research Center
[49] Massachusetts General Hospital
[50] Immunogenomics and Metabolic Disease Laboratory
Source
Nature | 2016 / Volume 536
Abstract
Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human ‘knockout’ variants in protein-coding genes.
Pages: 285-291
Page count: 7