Exploring the limit of using a deep neural network on pileup data for germline variant calling

Authors
Ruibang Luo
Chak-Lim Wong
Yat-Sing Wong
Chi-Ian Tang
Chi-Man Liu
Chi-Ming Leung
Tak-Wah Lam
Affiliations
[1] The University of Hong Kong, Department of Computer Science
Source
Nature Machine Intelligence | 2020 / Volume 2, Issue 4
Abstract
Single-molecule sequencing technologies have emerged in recent years and revolutionized structural variant calling, complex genome assembly and epigenetic mark detection. However, the lack of a highly accurate small variant caller has limited these technologies from being more widely used. Here, we present Clair, the successor to Clairvoyante, a program for fast and accurate germline small variant calling, using single-molecule sequencing data. For Oxford Nanopore Technology data, Clair achieves better precision, recall and speed than several competing programs, including Clairvoyante, Longshot and Medaka. Through studying the missed variants and benchmarking intentionally overfitted models, we found that Clair may be approaching the limit of possible accuracy for germline small variant calling using pileup data and deep neural networks. Clair requires only a conventional central processing unit (CPU) for variant calling and is an open-source project available at https://github.com/HKU-BAL/Clair.
Pages: 220-227
Page count: 7