xAtlas: scalable small variant calling across heterogeneous next-generation sequencing experiments

Cited by: 3
Authors
Farek, Jesse [1]
Hughes, Daniel [1,2]
Salerno, William [1,3]
Zhu, Yiming [1]
Pisupati, Aishwarya [1]
Mansfield, Adam [1,3]
Krasheninina, Olga [1,3]
English, Adam C. [1]
Metcalf, Ginger [1]
Boerwinkle, Eric [1,4]
Muzny, Donna M. [1]
Gibbs, Richard [1]
Khan, Ziad [1]
Sedlazeck, Fritz J. [1]
Affiliations
[1] Baylor Coll Med, Human Genome Sequencing Ctr, One Baylor Plaza, Houston, TX 77030 USA
[2] Columbia Univ, Inst Genom Med, New York, NY USA
[3] Regeneron Pharmaceut Inc, Tarrytown, NY USA
[4] Univ Texas Hlth Sci Ctr Houston, Human Genet Ctr, El Paso, TX USA
Source
GIGASCIENCE | 2023 / Vol. 12
Keywords
GENOTYPE; GENOMES; SNP;
DOI
10.1093/gigascience/giac125
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: The growing volume and heterogeneity of next-generation sequencing (NGS) data complicate the further optimization of identifying DNA variation, especially considering that curated high-confidence variant call sets frequently used to validate these methods are generally developed from the analysis of comparatively small and homogeneous sample sets. Findings: We have developed xAtlas, a single-sample variant caller for single-nucleotide variants (SNVs) and small insertions and deletions (indels) in NGS data. xAtlas features rapid runtimes, support for CRAM and gVCF file formats, and retraining capabilities. xAtlas reports SNVs with 99.11% recall and 98.43% precision across a reference HG002 sample at 60x whole-genome coverage in less than 2 CPU hours. Applying xAtlas to 3,202 samples at 30x whole-genome coverage from the 1000 Genomes Project achieves an average runtime of 1.7 hours per sample and a clear separation of the individual populations in principal component analysis across called SNVs. Conclusions: xAtlas is a fast, lightweight, and accurate SNV and small indel calling method. Source code for xAtlas is available under a BSD 3-clause license at https://github.com/jfarek/xatlas.
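As a reading aid for the recall and precision figures quoted in the abstract, the following minimal Python sketch shows the standard definitions of these metrics in terms of true-positive, false-positive, and false-negative call counts. The counts used below are hypothetical and are not taken from the paper. The F1 score at the end is derived here from the two reported percentages and is not a figure stated in the abstract.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of reported variant calls that are correct."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true variants that were reported."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts for illustration only (not from the paper).
tp, fp, fn = 9_900, 150, 100
example_precision = precision(tp, fp)
example_recall = recall(tp, fn)

# F1 derived from the SNV precision (98.43%) and recall (99.11%)
# quoted in the abstract; the abstract itself does not report F1.
reported_f1 = f1_score(0.9843, 0.9911)

print(f"example precision: {example_precision:.4f}")
print(f"example recall:    {example_recall:.4f}")
print(f"derived F1:        {reported_f1:.4f}")
```

In a real benchmark these counts would come from comparing a caller's output against a curated high-confidence call set, such as the HG002 reference sample mentioned in the abstract.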
Pages: 7
Related papers
50 items in total
  • [42] Reproducibility of Variant Calls in Replicate Next Generation Sequencing Experiments
    Qi, Yuan
    Liu, Xiuping
    Liu, Chang-gong
    Wang, Bailing
    Hess, Kenneth R.
    Symmans, W. Fraser
    Shi, Weiwei
    Pusztai, Lajos
    [J]. PLOS ONE, 2015, 10 (07):
  • [43] Unbiased machine learning methods to predict the limitations of variant calling in homologous genomic regions using next-generation sequencing
    Li, Feng
    Gnanaolivu, Rohan
    Vidal-Folch, Noemi
    Saha, Neiladri
    Mistry, Nipun
    Blake, Emily
    Niu, Zhiyv
    McClelland, Shawn
    Oglesbee, Devin
    Wang, Chen
    [J]. MOLECULAR GENETICS AND METABOLISM, 2021, 132 : S250 - S252
  • [44] Sequencing technologies - the next generation (series: Applications of Next-Generation Sequencing)
    Metzker, Michael L.
    [J]. NATURE REVIEWS GENETICS, 2010, 11 (01) : 31 - 46
  • [45] Comparison of seven SNP calling pipelines for the next-generation sequencing data of chickens
    Liu, Jing
    Shen, Qingmiao
    Bao, Haigang
    [J]. PLOS ONE, 2022, 17 (01):
  • [46] Genotype calling and phasing using next-generation sequencing reads and a haplotype scaffold
    Menelaou, Androniki
    Marchini, Jonathan
    [J]. BIOINFORMATICS, 2013, 29 (01) : 84 - 91
  • [47] SeqEM: an adaptive genotype-calling approach for next-generation sequencing studies
    Martin, E. R.
    Kinnamon, D. D.
    Schmidt, M. A.
    Powell, E. H.
    Zuchner, S.
    Morris, R. W.
    [J]. BIOINFORMATICS, 2010, 26 (22) : 2803 - 2810
  • [48] Comparison of insertion/deletion calling algorithms on human next-generation sequencing data
    Ghoneim D.H.
    Myers J.R.
    Tuttle E.
    Paciorkowski A.R.
    [J]. BMC Research Notes, 7 (1)
  • [49] Benchmarking variant callers in next-generation and third-generation sequencing analysis
    Pei, Surui
    Liu, Tao
    Ren, Xue
    Li, Weizhong
    Chen, Chongjian
    Xie, Zhi
    [J]. BRIEFINGS IN BIOINFORMATICS, 2021, 22 (03)
  • [50] Advancing next-generation sequencing data analytics with scalable distributed infrastructure
    Kim, Joohyun
    Maddineni, Sharath
    Jha, Shantenu
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2014, 26 (04) : 894 - 906