Polishing copy number variant calls on exome sequencing data via deep learning

Cited by: 6
Authors
Ozden, Furkan [1]
Alkan, Can [1]
Cicek, A. Ercument [1, 2]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] Carnegie Mellon Univ, Computat Biol Dept, Pittsburgh, PA 15213 USA
Keywords
WHOLE-GENOME;
DOI
10.1101/gr.274845.120
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Accurate and efficient detection of copy number variants (CNVs) is of critical importance owing to their significant association with complex genetic diseases. Although algorithms that use whole-genome sequencing (WGS) data provide stable results with mostly valid statistical assumptions, copy number detection on whole-exome sequencing (WES) data shows comparatively lower accuracy. This is unfortunate as WES data are cost-efficient, compact, and relatively ubiquitous. The bottleneck is primarily due to the noncontiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses the matched WES and WGS data, and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that our model consistently improves the performance independent of (1) sequencing technology, (2) exome capture kit, and (3) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets.
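To make the precision claim concrete, the following is a minimal sketch of how per-class call precision could be computed when WES-based calls are scored against labels derived from matched WGS data. The function name `per_class_precision`, the label strings, and all toy values are assumptions made for illustration; they are not taken from the paper, and the `polished` list is written by hand rather than produced by DECoNT.

```python
from collections import Counter


def per_class_precision(calls, truth, classes=("DEL", "DUP")):
    """Return precision per call type: correct calls of a type / all calls of that type."""
    called = Counter()
    correct = Counter()
    for call, label in zip(calls, truth):
        if call in classes:
            called[call] += 1
            if call == label:
                correct[call] += 1
    # Precision is undefined (NaN) for a class that was never called.
    return {
        c: (correct[c] / called[c] if called[c] else float("nan"))
        for c in classes
    }


# Toy, invented labels for six exon-level events (not real data).
wgs_truth = ["DUP", "NO-CALL", "NO-CALL", "DEL", "NO-CALL", "NO-CALL"]
wes_calls = ["DUP", "DUP", "DUP", "DEL", "DEL", "NO-CALL"]
# Hypothetical calls after a polishing step (hand-written for illustration).
polished = ["DUP", "NO-CALL", "NO-CALL", "DEL", "NO-CALL", "NO-CALL"]

before = per_class_precision(wes_calls, wgs_truth)
after = per_class_precision(polished, wgs_truth)
print("before:", before)
print("after: ", after)
```

In this toy setup a polisher raises precision by turning false-positive duplication and deletion calls into no-calls; the sketch ignores recall, which would need to be tracked alongside precision in any real evaluation.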
Pages: 1170-1182
Page count: 13