A study on fast calling variants from next-generation sequencing data using decision tree

Cited by: 9
Authors
Li, Zhentang [1,2]
Wang, Yi [3,4,5]
Wang, Fei [1,2]
Affiliations
[1] Shanghai Key Lab Intelligent Informat Proc, Shanghai, Peoples R China
[2] Fudan Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[3] Fudan Univ, MOE Key Lab Contemporary Anthropol, Shanghai 200438, Peoples R China
[4] Fudan Univ, Collaborat Innovat Ctr Genet & Dev Biol, State Key Lab Genet Engn, Shanghai 200438, Peoples R China
[5] Fudan Univ, Sch Life Sci, Shanghai 200438, Peoples R China
Source
BMC Bioinformatics | 2018 / Volume 19
Funding
National Natural Science Foundation of China
Keywords
Next-generation sequencing; Variant calling; Decision tree; FRAMEWORK; FORMAT;
DOI
10.1186/s12859-018-2147-9
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: The rapid development of next-generation sequencing (NGS) technology has steadily increased sequencing throughput. However, for lack of a tool that is both fast and accurate, analyzing NGS data, especially low-coverage data, remains challenging.
Results: We propose a decision-tree-based variant calling algorithm. Experiments on a set of real data indicate that the algorithm achieves high accuracy and sensitivity for SNVs and indels and adapts well to low-coverage data. In particular, it is noticeably faster than 3 widely used tools in our experiments.
Conclusions: We implemented the algorithm in a software tool named Fuwa and applied it, together with 4 well-known variant callers (Platypus, GATK-UnifiedGenotyper, GATK-HaplotypeCaller and SAMtools), to three sequencing data sets of the well-studied sample NA12878, produced by whole-genome, whole-exome and low-coverage whole-genome sequencing, respectively. We also conducted additional experiments on the WGS data of 4 newly released samples that have not been used to populate dbSNP.
Pages: 14