Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale

Authors
Xihao Li
Zilin Li
Hufeng Zhou
Sheila M. Gaynor
Yaowu Liu
Han Chen
Ryan Sun
Rounak Dey
Donna K. Arnett
Stella Aslibekyan
Christie M. Ballantyne
Lawrence F. Bielak
John Blangero
Eric Boerwinkle
Donald W. Bowden
Jai G. Broome
Matthew P. Conomos
Adolfo Correa
L. Adrienne Cupples
Joanne E. Curran
Barry I. Freedman
Xiuqing Guo
George Hindy
Marguerite R. Irvin
Sharon L. R. Kardia
Sekar Kathiresan
Alyna T. Khan
Charles L. Kooperberg
Cathy C. Laurie
X. Shirley Liu
Michael C. Mahaney
Ani W. Manichaikul
Lisa W. Martin
Rasika A. Mathias
Stephen T. McGarvey
Braxton D. Mitchell
May E. Montasser
Jill E. Moore
Alanna C. Morrison
Jeffrey R. O’Connell
Nicholette D. Palmer
Akhil Pampana
Juan M. Peralta
Patricia A. Peyser
Bruce M. Psaty
Susan Redline
Kenneth M. Rice
Stephen S. Rich
Jennifer A. Smith
Hemant K. Tiwari
Affiliations
[1] Harvard T.H. Chan School of Public Health,Department of Biostatistics
[2] Southwestern University of Finance and Economics,School of Statistics
[3] The University of Texas Health Science Center at Houston,Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health
[4] The University of Texas Health Science Center at Houston,Center for Precision Health, School of Public Health and School of Biomedical Informatics
[5] University of Texas MD Anderson Cancer Center,Department of Biostatistics
[6] College of Public Health,Department of Epidemiology
[7] University of Kentucky,Department of Medicine
[8] University of Alabama at Birmingham,Department of Epidemiology, School of Public Health
[9] Baylor College of Medicine,Human Genome Sequencing Center
[10] University of Michigan,Department of Biochemistry
[11] Department of Human Genetics and South Texas Diabetes and Obesity Institute,Division of Medical Genetics
[12] School of Medicine,Department of Biostatistics
[13] The University of Texas Rio Grande Valley,Jackson Heart Study, Department of Medicine
[14] Baylor College of Medicine,Department of Biostatistics
[15] Wake Forest University School of Medicine,Framingham Heart Study
[16] University of Washington,Department of Internal Medicine, Nephrology
[17] University of Washington,Department of Population Medicine
[18] University of Mississippi Medical Center,Cardiology Division
[19] Boston University School of Public Health,Department of Medicine
[20] National Heart,Division of Public Health Sciences
[21] Lung,Department of Data Sciences
[22] and Blood Institute and Boston University,Department of Statistics
[23] Wake Forest School of Medicine,Center for Public Health Genomics
[24] The Institute for Translational Genomics and Population Sciences,Division of Cardiology
[25] Department of Pediatrics,GeneSTAR Research Program, Department of Medicine
[26] The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center,Department of Epidemiology
[27] Qatar University College of Medicine,Department of Medicine
[28] QU Health,Geriatrics Research and Education Clinical Center
[29] Verve Therapeutics,Division of Endocrinology, Diabetes, and Nutrition, Program for Personalized and Genomic Medicine
[30] Massachusetts General Hospital,Program in Bioinformatics and Integrative Biology
[31] Harvard Medical School,Program in Medical and Population Genetics
[32] Fred Hutchinson Cancer Research Center,Center for Genomic Medicine and Cardiovascular Research Center
[33] Dana–Farber Cancer Institute and Harvard T.H. Chan School of Public Health,Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services
[34] Harvard University,Division of Sleep and Circadian Disorders
[35] University of Virginia,Division of Sleep Medicine
[36] George Washington School of Medicine and Health Sciences,Division of Pulmonary, Critical Care, and Sleep Medicine
[37] Johns Hopkins University School of Medicine,Survey Research Center
[38] International Health Institute,Department of Biostatistics, School of Public Health
[39] Department of Anthropology,Department of Laboratory Medicine and Pathology
[40] Brown University,Department of Medicine
[41] University of Maryland School of Medicine,Department of Human Genetics and Biostatistics
[42] Baltimore VA Medical Center,Department of Physiology and Biophysics
[43] University of Maryland School of Medicine,Division of Cardiology
[44] University of Massachusetts Medical School,Stanley Center for Psychiatric Research
[45] Broad Institute of Harvard and MIT,Analytic and Translational Genetics Unit
[46] Massachusetts General Hospital,Division of Genetics
[47] University of Washington,Department of Biomedical Informatics
[48] Kaiser Permanente Washington Health Research Institute,Department of Biostatistics
[49] Brigham and Women’s Hospital,Department of Internal Medicine
[50] Harvard Medical School,Department of Computational Medicine and Bioinformatics
Source
Nature Genetics | 2020 / Vol. 52
Abstract
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce ‘annotation principal components’, multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
Pages: 969-983
Page count: 14