MethyQA: a pipeline for bisulfite-treated methylation sequencing quality assessment
Cited by: 14
Authors:
Sun, Shuying [1,2]
Noviski, Aaron [3]
Yu, Xiaoqing [1]
Affiliations:
[1] Case Western Reserve Univ, Dept Epidemiol & Biostat, Cleveland, OH 44106 USA
[2] Texas State Univ, Dept Math, San Marcos, TX 78666 USA
[3] Case Western Reserve Univ, Dept Elect Engn & Comp Sci, Cleveland, OH 44106 USA
DNA methylation;
Next generation sequencing;
Alignment;
BRAT;
Quality assessment;
DNA METHYLATION;
BREAST-CANCER;
CPG ISLANDS;
HYPERMETHYLATION;
PLURIPOTENT;
EFFICIENT;
ALIGNMENT;
MARKERS;
COLON;
MAPS;
DOI: 10.1186/1471-2105-14-259
Chinese Library Classification number:
Q5 [Biochemistry];
Discipline classification code:
071010;
081704;
Abstract:
Background: DNA methylation is an epigenetic modification that adds a methyl group to the 5-carbon position of cytosine. This modification can significantly affect gene expression in both normal and diseased cells. Hence, it is important to study methylation signals at the single-cytosine level, which is now possible using bisulfite conversion (i.e., converting unmethylated Cs to Us, which are read as Ts after PCR amplification) together with next generation sequencing (NGS) technologies. Despite advances in NGS technologies, certain quality issues remain. The more prevalent ones involve low per-base sequencing quality at the 3' end, PCR amplification bias, and bisulfite conversion rates. Therefore, it is important to conduct quality assessment before downstream analysis. To the best of our knowledge, no existing software package can generally assess the quality of methylation sequencing data generated with different bisulfite-treatment protocols.
Results: To assess the quality of bisulfite methylation sequencing data, we have developed a pipeline named MethyQA. MethyQA combines currently available open-source software packages with our own custom programs written in Perl and R. The pipeline can provide quality assessment results for tens of millions of reads in under an hour. The novelty of our pipeline lies in its examination of bisulfite conversion rates and of the DNA sequence structure of regions that have different conversion rates or coverage.
Conclusions: MethyQA is a new software package that gives users insight into the methylation sequencing data they are studying. It allows users to determine the quality of their data and better prepares them to address the research questions that lie ahead. Given the speed and efficiency with which MethyQA operates, we expect it to become an important tool for studies dealing with bisulfite methylation sequencing data.
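The bisulfite conversion step and the conversion rate mentioned in the abstract can be illustrated with a minimal Python sketch. This is not MethyQA's implementation (the pipeline itself is written in Perl and R); the function names `bisulfite_convert` and `conversion_rate` are hypothetical. The sketch also assumes that cytosines outside a CpG context are unmethylated, so that any such cytosine still read as C reflects incomplete conversion. That assumption is common for mammalian somatic tissue but does not hold for every sample type.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment followed by PCR on one strand.

    Unmethylated C is read as T; methylated C (0-based positions given in
    `methylated_positions`) is protected and stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def conversion_rate(reference, read):
    """Estimate the conversion rate from non-CpG cytosines.

    Assumption (not from the source): non-CpG cytosines are unmethylated,
    so the rate is the fraction of them that appear as T in the read.
    `reference` and `read` must already be aligned and of equal length.
    """
    if len(reference) != len(read):
        raise ValueError("reference and read must have the same length")
    converted = 0
    total = 0
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        # Skip the last base (context unknown) and any CpG cytosine.
        if i + 1 >= len(reference) or reference[i + 1] == "G":
            continue
        if read_base == "T":
            converted += 1
            total += 1
        elif read_base == "C":
            total += 1
        # Any other base is treated as a sequencing error and ignored.
    return converted / total if total else float("nan")


reference = "ACGTCATCCA"
# Position 1 is a methylated CpG cytosine; all other Cs are unmethylated.
full = bisulfite_convert(reference, methylated_positions={1})
partial = "ACGTTATCTA"  # the C at position 7 failed to convert
print(full, conversion_rate(reference, full))
print(partial, conversion_rate(reference, partial))
```

In practice such a rate would be computed over many aligned reads rather than a single toy sequence, and a spiked-in unmethylated control genome is another common basis for the estimate.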