Polishing copy number variant calls on exome sequencing data via deep learning

Times cited: 6
Authors
Ozden, Furkan [1]
Alkan, Can [1]
Cicek, A. Ercument [1, 2]
Affiliations
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] Carnegie Mellon Univ, Computat Biol Dept, Pittsburgh, PA 15213 USA
Keywords
WHOLE-GENOME;
DOI
10.1101/gr.274845.120
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Accurate and efficient detection of copy number variants (CNVs) is of critical importance owing to their significant association with complex genetic diseases. Although algorithms that use whole-genome sequencing (WGS) data provide stable results with mostly valid statistical assumptions, copy number detection on whole-exome sequencing (WES) data shows comparatively lower accuracy. This is unfortunate as WES data are cost-efficient, compact, and relatively ubiquitous. The bottleneck is primarily due to the noncontiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses the matched WES and WGS data, and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that our model consistently improves the performance independent of (1) sequencing technology, (2) exome capture kit, and (3) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets.
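The abstract reports its gains as duplication and deletion call precision. As a minimal sketch of that metric only (not the DECoNT model or the authors' evaluation code), the Python snippet below computes per-type precision for a set of caller labels against reference labels. The function name `precision_by_type`, the label strings, and the toy label lists are all assumptions made up for illustration; they are not data from the paper.

```python
from collections import Counter


def precision_by_type(calls, truth, types=("DEL", "DUP")):
    """Per-type precision: of the events called as type t, the fraction
    whose reference label is also t. Returns NaN if t was never called."""
    called = Counter()
    correct = Counter()
    for call, ref in zip(calls, truth):
        if call in types:
            called[call] += 1
            if call == ref:
                correct[call] += 1
    return {
        t: (correct[t] / called[t] if called[t] else float("nan"))
        for t in types
    }


# Hypothetical toy labels, one entry per candidate event (illustration only).
truth = ["DEL", "NO_CALL", "DUP", "NO_CALL", "NO_CALL", "DEL"]      # e.g. WGS-derived reference
raw = ["DEL", "DEL", "DUP", "DUP", "DUP", "DEL"]                    # labels from a WES caller
polished = ["DEL", "NO_CALL", "DUP", "NO_CALL", "DUP", "DEL"]       # labels after a polishing step

raw_precision = precision_by_type(raw, truth)
polished_precision = precision_by_type(polished, truth)
print(raw_precision)
print(polished_precision)
```

In this toy case the polishing step relabels some false-positive calls as no-calls, which raises precision for both types; the numbers carry no meaning beyond showing how the metric is computed.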
Pages: 1170-1182
Number of pages: 13