Statistical challenges associated with detecting copy number variations with next-generation sequencing

Cited by: 161
Authors
Teo, Shu Mei [1 ,2 ,3 ]
Pawitan, Yudi [3 ]
Ku, Chee Seng [3 ]
Chia, Kee Seng [1 ,2 ]
Salim, Agus [1 ]
Affiliations
[1] Natl Univ Singapore, Saw Swee Hock Sch Publ Hlth, Singapore 117597, Singapore
[2] Natl Univ Singapore, NUS Grad Sch Integrat Sci & Engn, Singapore 117456, Singapore
[3] Karolinska Inst, Dept Med Epidemiol & Biostat, S-17177 Stockholm, Sweden
Keywords
STRUCTURAL VARIATION; PAIRED-END; SHORT-READ; ALGORITHMS; VARIANTS; MODEL; TOOL;
DOI
10.1093/bioinformatics/bts535
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Motivation: Analysing next-generation sequencing (NGS) data for copy number variation (CNV) detection is a relatively new and challenging field, with no accepted standard protocols or quality control measures so far. Several algorithms have now been developed for each of the four broad methods for CNV detection using NGS, namely the depth of coverage (DOC), read-pair, split-read and assembly-based methods. However, because of the complexity of the genome and the short read lengths from NGS technology, there are still many challenges associated with the analysis of NGS data for CNVs, no matter which method or algorithm is used.
Results: In this review, we describe and discuss areas of potential bias in CNV detection for each of the four methods. In particular, we focus on issues pertaining to (i) mappability, (ii) GC-content bias, (iii) quality control measures of reads and (iv) difficulty in identifying duplications. To gain insight into some of the issues discussed, we also download real data from the 1000 Genomes Project and analyse its DOC data. We show examples of how reads in repeated regions can affect CNV detection, demonstrate current GC-correction algorithms, investigate the sensitivity of a DOC algorithm before and after quality control of reads and discuss reasons why duplications are harder to detect than deletions.
Pages: 2711-2718
Page count: 8
Related papers
50 in total
  • [1] CNV-PCC: An efficient method for detecting copy number variations from next-generation sequencing data
    Zhang, Tong
    Dong, Jinxin
    Jiang, Hua
    Zhao, Zuyao
    Zhou, Mengjiao
    Yuan, Tianting
    FRONTIERS IN BIOENGINEERING AND BIOTECHNOLOGY, 2022, 10
  • [3] Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges
    Liu, Biao
    Morrison, Carl D.
    Johnson, Candace S.
    Trump, Donald L.
    Qin, Maochun
    Conroy, Jeffrey C.
    Wang, Jianmin
    Liu, Song
    ONCOTARGET, 2013, 4 (11) : 1868 - 1881
  • [4] A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data
    Hill, Tom
    Unckless, Robert L.
    G3-GENES GENOMES GENETICS, 2019, 9 (11) : 3575 - 3582
  • [5] Identification of copy number variations associated with congenital heart disease by chromosomal microarray analysis and next-generation sequencing
    Zhu, Xiangyu
    Li, Jie
    Ru, Tong
    Wang, Yaping
    Xu, Yan
    Yang, Ying
    Wu, Xing
    Cram, David S.
    Hu, Yali
    PRENATAL DIAGNOSIS, 2016, 36 (04) : 321 - 327
  • [6] Detecting copy number variations in routine diagnostic samples using next generation sequencing data
    Singh, Ashish Kumar
    Johansen, Jostein
    Ravi, Anuradha
    Misund, Kristine
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2024, 32 : 639 - 639
  • [7] Copy Number Variation Detection Using Next-Generation Sequencing
    Baughn, Linda B.
    Onsongo, Getiria
    Bower, Matthew
    Henzler, Christine
    Silverstein, Kevin A. T.
    Thyagarajan, Bharat
    AMERICAN JOURNAL OF CLINICAL PATHOLOGY, 2015, 143 : A13 - A13
  • [8] Detection of Significant Copy Number Variations From Multiple Samples in Next-Generation Sequencing Data
    Yuan, Xiguo
    Zhang, Junying
    Yang, Liying
    Bai, Jun
    Fan, Peizhen
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2018, 17 (01) : 12 - 20
  • [9] SeqCNV: a novel method for identification of copy number variations in targeted next-generation sequencing data
    Yong Chen
    Li Zhao
    Yi Wang
    Ming Cao
    Violet Gelowani
    Mingchu Xu
    Smriti A. Agrawal
    Yumei Li
    Stephen P. Daiger
    Richard Gibbs
    Fei Wang
    Rui Chen
    BMC Bioinformatics, 18
  • [10] USE OF NEXT-GENERATION SEQUENCING TO DETECT COPY NUMBER VARIATIONS IN THE MOLECULAR DIAGNOSIS OF FAMILIAL HYPERCHOLESTEROLEMIA
    Iacocca, Michael
    Wang, Jian
    Dron, Jacqueline
    Robinson, John
    Mcintyre, Adam
    Cao, Henian
    Hegele, Robert
    ATHEROSCLEROSIS, 2017, 263 : E236 - E236