Single-Variant and Multi-Variant Trend Tests for Genetic Association with Next-Generation Sequencing That Are Robust to Sequencing Error

Cited by: 5
Authors
Kim, Wonkuk [1]
Londono, Douglas [2,3]
Zhou, Lisheng [2,3]
Xing, Jinchuan [2,3]
Nato, Alejandro Q. [2,3]
Musolf, Anthony [2,3]
Matise, Tara C. [2,3]
Finch, Stephen J. [4]
Gordon, Derek [2,3]
Affiliations
[1] Univ S Florida, Dept Math & Stat, Tampa, FL USA
[2] Rutgers State Univ, Dept Genet, Piscataway, NJ 08854 USA
[3] Rutgers State Univ, Inst Human Genet, Piscataway, NJ 08854 USA
[4] SUNY Stony Brook, Dept Appl Math & Stat, Stony Brook, NY 11794 USA
Funding
National Institutes of Health (USA)
Keywords
Next-generation sequencing; Rare variant; Trend test; Genetic association; GWAS; Allele; Locus; SNP GENOTYPING ERRORS; SAMPLE-SIZE; QUALITY-CONTROL; MISCLASSIFICATION; POWER; LINKAGE; DESIGN; CANCER; IMPACT; PHENOTYPE;
DOI
10.1159/000346824
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. Two potential challenges are misclassification error (as with any emerging technology) and power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs.
Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies: first, it allows for differential misclassification when computing the statistic; second, it addresses the multiple-testing issue, in that the multi-variant form of the statistic has only one degree of freedom and provides a single p value, no matter how many loci are tested. Copyright (C) 2013 S. Karger AG, Basel
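For orientation, the classical linear (Cochran-Armitage) trend test that the abstract says LTTae,NGS extends can be sketched as follows. This is not the authors' statistic: it assumes error-free genotype calls, ignores coverage and misclassification entirely, and uses additive weights (0, 1, 2) as an illustrative choice. The function name `trend_test` and the example counts are hypothetical.

```python
import math


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Classical Cochran-Armitage linear trend test (1 degree of freedom).

    cases, controls: counts per genotype category (e.g. 0, 1, 2 copies
    of the variant allele). Assumes genotypes are observed without error,
    so this is NOT the error-robust LTTae,NGS described in the abstract.
    Returns (chi_square, p_value).
    """
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have equal length")

    totals = [r + s for r, s in zip(cases, controls)]  # n_i per genotype
    n_cases = sum(cases)        # R
    n_controls = sum(controls)  # S
    n = n_cases + n_controls    # N

    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * t for w, t in zip(weights, totals))
    sum_w2n = sum(w * w * t for w, t in zip(weights, totals))

    # Numerator: squared trend contrast; denominator: its null variance.
    contrast = n * sum_wr - n_cases * sum_wn
    denom = n_cases * n_controls * (n * sum_w2n - sum_wn ** 2)
    if denom == 0:
        raise ValueError("trend statistic undefined (no variation in data)")

    chi_square = n * contrast ** 2 / denom
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Made-up genotype counts, for illustration only.
chi_square, p_value = trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
print(chi_square, p_value)
```

Because the statistic has one degree of freedom regardless of the number of genotype categories, it yields a single p value per test, which is the property the abstract carries over to the multi-variant form.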
Pages: 172-183
Page count: 12