Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data

Cited by: 1
Authors
Boutry, Simon [1, 2]
Helaers, Raphael [1]
Lenaerts, Tom [2, 3, 4]
Vikkula, Miikka [1, 5]
Affiliations
[1] Univ Louvain, Human Mol Genet, de Duve Inst, Brussels, Belgium
[2] Vrije Univ Brussel, Univ Libre Bruxelles, Interuniv Inst Bioinformat Brussels, Brussels, Belgium
[3] Univ Libre Bruxelles, Machine Learning Grp, Brussels, Belgium
[4] Vrije Univ Brussel, Artificial Intelligence Lab, Brussels, Belgium
[5] WEL Res Inst, WELBIO Dept, Wavre, Belgium
Keywords
STATISTICAL TESTS; DISEASE ASSOCIATION; COMMON DISEASES; DETECTING ASSOCIATIONS; GENETIC ASSOCIATION; GENERAL FRAMEWORK; MULTIPLE SNPS; R PACKAGE; POWER; PATHOGENICITY;
DOI
10.1371/journal.pcbi.1011488
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
The development of high-throughput next-generation sequencing technologies and large-scale genetic association studies has produced numerous advances in the biostatistics field. Various aggregation tests, i.e. statistical methods that analyze associations of a trait with multiple markers within a genomic region, have produced a variety of novel discoveries. Notwithstanding their usefulness, no single test fits all needs, and each suffers from specific drawbacks. Selecting the right aggregation test, while considering an unknown underlying genetic model of the disease, remains an important challenge. Here we propose a new ensemble method, called Excalibur, based on an optimal combination of 36 aggregation tests created after an in-depth study of the limitations of each test and their impact on the quality of the results. Our findings demonstrate the ability of our method to control type I error and illustrate that it offers the best average power across all scenarios. The proposed method allows for novel advances in whole-exome/genome sequencing association studies: it is able to handle a wide range of association models, providing researchers with an optimal aggregation analysis for the genetic regions of interest.

An increasing number of diseases previously thought to be caused by a mutation in a single gene are now being considered as involving several variants in a small number of genes (i.e. "oligogenic"). There is a limited number of dedicated bioinformatic tools to study such oligogenic causes of diseases. These include so-called aggregation tests. Yet, an important challenge is to select the right aggregation test among the various ones that have been developed, as each suffers from different limitations. We have computationally compared 59 aggregation methods to explore their limitations. We found that combining 36 of them results in a more robust method, which we baptized "Excalibur". It can handle a wider range of hypotheses and case-control studies than any of the single methods, while reducing the number of false positive results. Excalibur also provides a comprehensive elucidation of the underlying genetic architecture pertaining to each genomic region under investigation. Thus, it provides a user-friendly and statistically sound platform to study oligogenic inheritance with the increasing amount of available genetic data.
Pages: 26