Single-Variant and Multi-Variant Trend Tests for Genetic Association with Next-Generation Sequencing That Are Robust to Sequencing Error

Cited by: 5
Authors
Kim, Wonkuk [1]
Londono, Douglas [2,3]
Zhou, Lisheng [2,3]
Xing, Jinchuan [2,3]
Nato, Alejandro Q. [2,3]
Musolf, Anthony [2,3]
Matise, Tara C. [2,3]
Finch, Stephen J. [4]
Gordon, Derek [2,3]
Affiliations
[1] Univ S Florida, Dept Math & Stat, Tampa, FL USA
[2] Rutgers State Univ, Dept Genet, Piscataway, NJ 08854 USA
[3] Rutgers State Univ, Inst Human Genet, Piscataway, NJ 08854 USA
[4] SUNY Stony Brook, Dept Appl Math & Stat, Stony Brook, NY 11794 USA
Funding
National Institutes of Health (United States)
Keywords
Next-generation sequencing; Rare variant; Trend test; Genetic association; GWAS; Allele; Locus; SNP GENOTYPING ERRORS; SAMPLE-SIZE; QUALITY-CONTROL; MISCLASSIFICATION; POWER; LINKAGE; DESIGN; CANCER; IMPACT; PHENOTYPE;
DOI
10.1159/000346824
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology), and another is power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs.
Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies: first, it allows for differential misclassification when computing the statistic; second, it addresses the multiple-testing issue in that the multi-variant form of the statistic has only one degree of freedom and provides a single p value, no matter how many loci are tested. Copyright (C) 2013 S. Karger AG, Basel
Pages: 172-183
Number of pages: 12