Fast detection of de novo copy number variants from SNP arrays for case-parent trios

Cited by: 9
Authors
Scharpf, Robert B. [1]
Beaty, Terri H. [2]
Schwender, Holger [3]
Younkin, Samuel G. [5]
Scott, Alan F. [4]
Ruczinski, Ingo [5]
Affiliations
[1] Johns Hopkins Univ, Dept Oncol, Baltimore, MD USA
[2] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Epidemiol, Baltimore, MD USA
[3] Univ Dusseldorf, Math Inst, D-40225 Dusseldorf, Germany
[4] Johns Hopkins Univ, Dept Med, Baltimore, MD USA
[5] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD USA
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Funding
U.S. National Institutes of Health (NIH)
Keywords
Trios; Oral cleft; Copy number variants; de novo; High-throughput arrays; Segmentation; batch effects; Genomic waves; HIDDEN MARKOV-MODELS; CIRCULAR BINARY SEGMENTATION; DIGEORGE-SYNDROME; STATISTICAL APPROACH; CGH DATA; DELETION; ASSOCIATION; ABERRATIONS; INTENSITIES; ALGORITHM;
DOI
10.1186/1471-2105-13-330
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: In studies of case-parent trios, we define copy number variants (CNVs) in the offspring that differ from the parental copy numbers as de novo and of interest for their potential functional role in disease. Among the leading array-based methods for discovery of de novo CNVs in case-parent trios is the joint hidden Markov model (HMM) implemented in the PennCNV software. However, the computational demands of the joint HMM are substantial and the extent to which false positive identifications occur in case-parent trios has not been well described. We evaluate these issues in a study of oral cleft case-parent trios.
Results: Our analysis of the oral cleft trios reveals that genomic waves represent a substantial source of false positive identifications in the joint HMM, despite a wave-correction implementation in PennCNV. In addition, the noise of low-level summaries of relative copy number (log R ratios) is strongly associated with batch and correlated with the frequency of de novo CNV calls. Exploiting the trio design, we propose a univariate statistic for relative copy number referred to as the minimum distance that can reduce technical variation from probe effects and genomic waves. We use circular binary segmentation to segment the minimum distance and maximum a posteriori estimation to infer de novo CNVs from the segmented genome. Compared to PennCNV on simulated data, MinimumDistance identifies fewer false positives on average and is comparable to PennCNV with respect to false negatives. Genomic waves contribute to discordance of PennCNV and MinimumDistance for high coverage de novo calls, while highly concordant calls on chromosome 22 were validated by quantitative PCR. Computationally, MinimumDistance provides a nearly 8-fold increase in speed relative to the joint HMM in a study of oral cleft trios.
Conclusions: Our results indicate that batch effects and genomic waves are important considerations for case-parent studies of de novo CNV, and that the minimum distance is an effective statistic for reducing technical variation contributing to false de novo discoveries. Coupled with segmentation and maximum a posteriori estimation, our algorithm compares favorably to the joint HMM with MinimumDistance being much faster.
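The abstract names the minimum distance but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: at each marker, take the two offspring-minus-parent differences in log R ratio and keep the signed one with the smaller absolute value. The function name `minimum_distance`, the sign convention, and the toy values are all assumptions made for this example, not the MinimumDistance software's interface; segmentation and maximum a posteriori calling are not shown.

```python
def minimum_distance(offspring, father, mother):
    """Per-marker signed offspring-parent difference of smaller magnitude.

    Assumed definition (not taken from the abstract): for each marker,
    compute offspring - father and offspring - mother, then keep the
    difference whose absolute value is smaller.
    """
    if not (len(offspring) == len(father) == len(mother)):
        raise ValueError("all three log R ratio vectors must have equal length")
    distances = []
    for o, f, m in zip(offspring, father, mother):
        d_father = o - f
        d_mother = o - m
        # Ties go to the father difference; the magnitude is the same either way.
        distances.append(d_father if abs(d_father) <= abs(d_mother) else d_mother)
    return distances


# Hypothetical log R ratios at four markers (made-up values).
offspring = [0.0, -0.5, -0.5, 0.25]
father = [0.0, 0.0, -0.5, 0.0]
mother = [0.0, 0.0, 0.0, 0.25]

md = minimum_distance(offspring, father, mother)
print(md)
```

Under this reading, a marker where the offspring matches at least one parent gets a value near zero, so a signal shared with a parent (an inherited variant, or a probe effect or wave common to the trio) is damped. Only the second toy marker, where the offspring differs from both parents, keeps a nonzero value.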
Pages: 14