Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale

Cited by: 0
Authors
Xihao Li
Zilin Li
Hufeng Zhou
Sheila M. Gaynor
Yaowu Liu
Han Chen
Ryan Sun
Rounak Dey
Donna K. Arnett
Stella Aslibekyan
Christie M. Ballantyne
Lawrence F. Bielak
John Blangero
Eric Boerwinkle
Donald W. Bowden
Jai G. Broome
Matthew P. Conomos
Adolfo Correa
L. Adrienne Cupples
Joanne E. Curran
Barry I. Freedman
Xiuqing Guo
George Hindy
Marguerite R. Irvin
Sharon L. R. Kardia
Sekar Kathiresan
Alyna T. Khan
Charles L. Kooperberg
Cathy C. Laurie
X. Shirley Liu
Michael C. Mahaney
Ani W. Manichaikul
Lisa W. Martin
Rasika A. Mathias
Stephen T. McGarvey
Braxton D. Mitchell
May E. Montasser
Jill E. Moore
Alanna C. Morrison
Jeffrey R. O’Connell
Nicholette D. Palmer
Akhil Pampana
Juan M. Peralta
Patricia A. Peyser
Bruce M. Psaty
Susan Redline
Kenneth M. Rice
Stephen S. Rich
Jennifer A. Smith
Hemant K. Tiwari
Affiliations
[1] Harvard T.H. Chan School of Public Health,Department of Biostatistics
[2] Southwestern University of Finance and Economics,School of Statistics
[3] The University of Texas Health Science Center at Houston,Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health
[4] The University of Texas Health Science Center at Houston,Center for Precision Health, School of Public Health and School of Biomedical Informatics
[5] University of Texas MD Anderson Cancer Center,Department of Biostatistics
[6] College of Public Health,Department of Epidemiology
[7] University of Kentucky,Department of Medicine
[8] University of Alabama at Birmingham,Department of Epidemiology, School of Public Health
[9] Baylor College of Medicine,Human Genome Sequencing Center
[10] University of Michigan,Department of Biochemistry
[11] Department of Human Genetics and South Texas Diabetes and Obesity Institute,Division of Medical Genetics
[12] School of Medicine,Department of Biostatistics
[13] The University of Texas Rio Grande Valley,Jackson Heart Study, Department of Medicine
[14] Baylor College of Medicine,Department of Biostatistics
[15] Wake Forest University School of Medicine,Framingham Heart Study
[16] University of Washington,Department of Internal Medicine, Nephrology
[17] University of Washington,Department of Population Medicine
[18] University of Mississippi Medical Center,Cardiology Division
[19] Boston University School of Public Health,Department of Medicine
[20] National Heart,Division of Public Health Sciences
[21] Lung,Department of Data Sciences
[22] and Blood Institute and Boston University,Department of Statistics
[23] Wake Forest School of Medicine,Center for Public Health Genomics
[24] The Institute for Translational Genomics and Population Sciences,Division of Cardiology
[25] Department of Pediatrics,GeneSTAR Research Program, Department of Medicine
[26] The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center,Department of Epidemiology
[27] Qatar University College of Medicine,Department of Medicine
[28] QU Health,Geriatrics Research and Education Clinical Center
[29] Verve Therapeutics,Division of Endocrinology, Diabetes, and Nutrition, Program for Personalized and Genomic Medicine
[30] Massachusetts General Hospital,Program in Bioinformatics and Integrative Biology
[31] Harvard Medical School,Program in Medical and Population Genetics
[32] Fred Hutchinson Cancer Research Center,Center for Genomic Medicine and Cardiovascular Research Center
[33] Dana–Farber Cancer Institute and Harvard T.H. Chan School of Public Health,Cardiovascular Health Research Unit, Departments of Medicine, Epidemiology, and Health Services
[34] Harvard University,Division of Sleep and Circadian Disorders
[35] University of Virginia,Division of Sleep Medicine
[36] George Washington School of Medicine and Health Sciences,Division of Pulmonary, Critical Care, and Sleep Medicine
[37] Johns Hopkins University School of Medicine,Survey Research Center
[38] International Health Institute,Department of Biostatistics, School of Public Health
[39] Department of Anthropology,Department of Laboratory Medicine and Pathology
[40] Brown University,Department of Medicine
[41] University of Maryland School of Medicine,Department of Human Genetics and Biostatistics
[42] Baltimore VA Medical Center,Department of Physiology and Biophysics
[43] University of Maryland School of Medicine,Division of Cardiology
[44] University of Massachusetts Medical School,Stanley Center for Psychiatric Research
[45] Broad Institute of Harvard and MIT,Analytic and Translational Genetics Unit
[46] Massachusetts General Hospital,Division of Genetics
[47] University of Washington,Department of Biomedical Informatics
[48] Kaiser Permanente Washington Health Research Institute,Department of Biostatistics
[49] Brigham and Women’s Hospital,Department of Internal Medicine
[50] Harvard Medical School,Department of Computational Medicine and Bioinformatics
Source
Nature Genetics | 2020 / Vol. 52 (9)
Abstract
Large-scale whole-genome sequencing studies have enabled the analysis of rare variants (RVs) associated with complex phenotypes. Commonly used RV association tests have limited scope to leverage variant functions. We propose STAAR (variant-set test for association using annotation information), a scalable and powerful RV association test method that effectively incorporates both variant categories and multiple complementary annotations using a dynamic weighting scheme. For the latter, we introduce ‘annotation principal components’, multidimensional summaries of in silico variant annotations. STAAR accounts for population structure and relatedness and is scalable for analyzing very large cohort and biobank whole-genome sequencing studies of continuous and dichotomous traits. We applied STAAR to identify RVs associated with four lipid traits in 12,316 discovery and 17,822 replication samples from the Trans-Omics for Precision Medicine Program. We discovered and replicated new RV associations, including disruptive missense RVs of NPC1L1 and an intergenic region near APOC1P1 associated with low-density lipoprotein cholesterol.
Pages: 969-983
Page count: 14