DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data

Cited by: 47
Authors
Nagasaki, Hideki [1, 2]
Mochizuki, Takako [1, 2]
Kodama, Yuichi [1, 2]
Saruhashi, Satoshi [1, 2]
Morizaki, Shota [3]
Sugawara, Hideaki [1, 2]
Ohyanagi, Hajime [4]
Kurata, Nori [4]
Okubo, Kousaku [1, 2, 5]
Takagi, Toshihisa [1, 2, 5]
Kaminuma, Eli [1, 2]
Nakamura, Yasukazu [1, 2]
Affiliations
[1] Natl Inst Genet, Ctr Informat Biol, Mishima, Shizuoka 4118510, Japan
[2] Natl Inst Genet, DNA Data Bank Japan, Mishima, Shizuoka 4118510, Japan
[3] Fujisoft Inc, Chiyoda Ku, Tokyo 1010022, Japan
[4] Natl Inst Genet, Plant Genet Lab, Mishima, Shizuoka 4118510, Japan
[5] Database Ctr Life Sci, Bunkyo Ku, Tokyo 1130032, Japan
Keywords
next-generation sequencing; sequence read archive; cloud computing; analytical pipeline; genome analysis; BURROWS-WHEELER TRANSFORM; RNA-SEQ DATA; GENOME SEQUENCE; ALIGNMENT; ULTRAFAST; ASSEMBLER; VARIANTS; ARCHIVE; BIOLOGY; FORMAT
DOI
10.1093/dnares/dst017
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
High-performance next-generation sequencing (NGS) technologies are advancing genomics and molecular biological research. However, the immense amount of sequence data requires computational skills and suitable hardware resources that are a challenge to molecular biologists. The DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG) has initiated a cloud computing-based analytical pipeline, the DDBJ Read Annotation Pipeline (DDBJ Pipeline), for a high-throughput annotation of NGS reads. The DDBJ Pipeline offers a user-friendly graphical web interface and processes massive NGS datasets using decentralized processing by NIG supercomputers currently free of charge. The proposed pipeline consists of two analysis components: basic analysis for reference genome mapping and de novo assembly and subsequent high-level analysis of structural and functional annotations. Users may smoothly switch between the two components in the pipeline, facilitating web-based operations on a supercomputer for high-throughput data analysis. Moreover, public NGS reads of the DDBJ Sequence Read Archive located on the same supercomputer can be imported into the pipeline through the input of only an accession number. This proposed pipeline will facilitate research by utilizing unified analytical workflows applied to the NGS data. The DDBJ Pipeline is accessible at http://p.ddbj.nig.ac.jp/.
Pages: 383-390
Page count: 8