AKSmooth: Enhancing low-coverage bisulfite sequencing data via kernel-based smoothing

Cited by: 0
Authors
Chen, Junfang [1,2]
Lutsik, Pavlo [2 ]
Akulenko, Ruslan [1 ]
Walter, Joern [2 ]
Helms, Volkhard [1 ]
Affiliations
[1] Univ Saarland, Ctr Bioinformat, D-66123 Saarbrucken, Germany
[2] Univ Saarland, Dept Genet, D-66123 Saarbrucken, Germany
Keywords
DNA methylation; whole-genome bisulfite sequencing; read coverage; cancer epigenomics; 5-methylcytosine; identification; pipeline; cells
DOI
10.1142/S0219720014420050
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Whole-genome bisulfite sequencing (WGBS) is an approach of growing importance. It is the only approach that provides a comprehensive picture of the genome-wide DNA methylation profile. However, obtaining sufficient genome and read coverage typically requires high sequencing costs. Bioinformatics tools can reduce this cost burden by improving the quality of sequencing data. We have developed a statistical method, Adjusted Local Kernel Smoother (AKSmooth), that can accurately and efficiently reconstruct single-CpG methylation estimates across the entire methylome from low-coverage bisulfite sequencing (Bi-Seq) data. We demonstrate the performance of AKSmooth on low-coverage (~4x) DNA methylation profiles of three human colon cancer samples and matched controls. Under the best set of parameters, AKSmooth-curated data showed high concordance with the gold-standard high-coverage sample (Pearson 0.90), outperforming the popular analogous method. In addition, AKSmooth was computationally efficient, with a benchmark runtime more than 4.5 times better than that of the reference tool. To summarize, AKSmooth is a simple and efficient tool that can provide an accurate human colon methylome estimation profile from low-coverage WGBS data. The proposed method is implemented in R and is available at https://github.com/Junfang/AKSmooth.
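The general idea of kernel-based smoothing for low-coverage methylation data can be illustrated with a minimal sketch. This is not the AKSmooth algorithm itself, whose adjustment step and parameters are not described in this record; it is a generic coverage-weighted Gaussian kernel smoother, written in Python rather than R, and the function names, the bandwidth value, and the toy data are all assumptions made for illustration. The `pearson` helper shows how concordance with a high-coverage reference could be scored.

```python
import math


def kernel_smooth_methylation(positions, meth_counts, total_counts, bandwidth=500.0):
    """Coverage-weighted Gaussian kernel smoother (hypothetical sketch).

    positions    : genomic coordinates of CpG sites (bp)
    meth_counts  : methylated read counts per site
    total_counts : total read counts (coverage) per site
    bandwidth    : kernel standard deviation in bp (assumed value)

    Returns one smoothed methylation level in [0, 1] per site.
    The double loop is quadratic in the number of sites; it is meant
    for illustration, not for whole-genome data.
    """
    if not (len(positions) == len(meth_counts) == len(total_counts)):
        raise ValueError("all inputs must have the same length")

    smoothed = []
    for x0 in positions:
        num = 0.0
        den = 0.0
        for x, m, n in zip(positions, meth_counts, total_counts):
            if n == 0:
                continue  # a site without reads carries no information
            # Gaussian distance weight, scaled by coverage so that
            # well-covered neighbours contribute more.
            w = math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) * n
            num += w * (m / n)
            den += w
        smoothed.append(num / den if den > 0 else float("nan"))
    return smoothed


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy low-coverage data (made up): the third site has no reads at all.
toy_positions = [100, 250, 400, 900, 1500]
toy_meth = [3, 4, 0, 1, 0]
toy_total = [4, 4, 0, 5, 3]
toy_smoothed = kernel_smooth_methylation(toy_positions, toy_meth, toy_total)
```

In this sketch a site with zero coverage still receives an estimate borrowed from its neighbours, which is the property that makes smoothing attractive at around 4x coverage. A larger bandwidth borrows from more distant sites and flattens local detail.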
Pages: 17