An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

Cited by: 11
Authors
Azam, Sarwar [1]
Rathore, Abhishek [1]
Shah, Trushar M. [1]
Telluri, Mohan [1]
Amindala, BhanuPrakash [1]
Ruperao, Pradeep [1, 2]
Katta, Mohan A. V. S. K. [1]
Varshney, Rajeev K. [1]
Affiliations
[1] Int Crops Res Inst Semi Arid Trop, Ctr Excellence Genom, Patancheru, Andhra Pradesh, India
[2] Univ Queensland, Sch Agr & Food Sci, Brisbane, Qld, Australia
Source
PLOS ONE | 2014 / Vol. 9 / Issue 07
Keywords
CHICKPEA CICER-ARIETINUM; NUCLEOTIDE POLYMORPHISM DISCOVERY; READ ALIGNMENT; TRANSCRIPTOME; CROP; DIVERSITY; ANNOTATION; SOFTWARE; GENOTYPE; FORMAT;
DOI
10.1371/journal.pone.0101754
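Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code
07; 0710; 09;
Abstract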
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools with a graphical user interface, for SNP discovery and for utilization of the SNPs in developing genotyping assays. The pipeline features pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of the genotypes analyzed, as well as against the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and of errors, if any. The pipeline also provides a confidence score or polymorphism information content value, with flanking sequences, for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data, at high speed. The pipeline is useful for members of the plant genetics and breeding community with no computational expertise who want to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process huge next generation sequencing datasets.
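The abstract mentions a polymorphism information content (PIC) value reported for each SNP but does not say how ISMU computes it. As a minimal sketch, assuming the standard Botstein et al. (1980) definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², the value could be derived from allele calls as follows. The function names `pic` and `pic_from_calls` are hypothetical and are not part of ISMU.

```python
from collections import Counter
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies.

    Assumes the standard Botstein et al. (1980) formula; this is an
    illustration, not ISMU's own implementation.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Sum of squared frequencies (expected homozygosity).
    homozygosity = sum(p * p for p in allele_freqs)
    # Correction term summed over every unordered pair of alleles.
    pair_term = sum(2 * (p * p) * (q * q)
                    for p, q in combinations(allele_freqs, 2))
    return 1.0 - homozygosity - pair_term


def pic_from_calls(calls):
    """Compute PIC at one SNP from a list of allele calls, one per line."""
    counts = Counter(calls)
    total = sum(counts.values())
    return pic([n / total for n in counts.values()])


if __name__ == "__main__":
    # Hypothetical allele calls at one SNP across four genotypes.
    print(pic_from_calls(["A", "A", "G", "G"]))
```

For a biallelic SNP the value peaks when both alleles are equally frequent and falls to zero when the site is monomorphic, which is why it can serve as a ranking criterion when choosing SNPs for assay design.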
Pages: 11
Related papers
50 in total
  • [21] BING: Biomedical informatics pipeline for Next Generation Sequencing
    Kriseman, Jeffrey
    Busick, Christopher
    Szelinger, Szabolcs
    Dinu, Valentin
    JOURNAL OF BIOMEDICAL INFORMATICS, 2010, 43 (03) : 428 - 434
  • [22] Statistical Method for Next Generation Sequencing Pipeline Comparisons
    Leblay, N.
    Elsensohn, M. -H.
    Dimassl, S.
    Campan-Fournier, A.
    Labalme, A.
    Sanlaville, D.
    Lesca, G.
    Barde, C.
    Roy, P.
    HUMAN HEREDITY, 2015, 79 (01) : 42 - 42
  • [23] A Highly Parallel Next-Generation DNA Sequencing Data Analysis Pipeline in Hadoop
    Aggour, Kareem S.
    Kumar, Vijay S.
    Sangurdekar, Dipen P.
    Newberg, Lee A.
    Kodira, Chinnappa D.
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 756 - 763
  • [24] MHTyper: a microhaplotype allele-calling pipeline for use with next generation sequencing data
    Zhang, Chi
    Cao, Yan-Dong
    Song, Jiao-Jiao
    Rao, Min
    Nie, Sheng-Jie
    Zhang, Guang-Feng
    Kang, Ke-Lai
    Ji, An-Quan
    Ye, Jian
    Wang, Le
    AUSTRALIAN JOURNAL OF FORENSIC SCIENCES, 2021, 53 (03) : 283 - 290
  • [25] InDelGT: An integrated pipeline for extracting indel genotypes for genetic mapping in a hybrid population using next-generation sequencing data
    Pan, Zhiliang
    Zhang, Jinpeng
    Bai, Shengjun
    Li, Zhiting
    Tong, Chunfa
    APPLICATIONS IN PLANT SCIENCES, 2022, 10 (06):
  • [26] SNP discovery in apple cultivars using next generation sequencing
    Sérgio Alencar
    Orzenil Silva-Junior
    Roberto Togawa
    Marcos Costa
    Luís Fernando Revers
    Georgios Pappas
    BMC Proceedings, 5 (Suppl 7)
  • [27] Performance of a next generation sequencing SNP assay on degraded DNA
    Gettings, Katherine Butler
    Kiesler, Kevin M.
    Vallone, Peter M.
    FORENSIC SCIENCE INTERNATIONAL-GENETICS, 2015, 19 : 1 - 9
  • [28] Ub-ISAP: a streamlined UNIX pipeline for mining unique viral vector integration sites from next generation sequencing data
    Kamboj, Atul
    Hallwirth, Claus V.
    Alexander, Ian E.
    McCowage, Geoffrey B.
    Kramer, Belinda
    BMC BIOINFORMATICS, 2017, 18
  • [29] Ub-ISAP: a streamlined UNIX pipeline for mining unique viral vector integration sites from next generation sequencing data
    Atul Kamboj
    Claus V. Hallwirth
    Ian E. Alexander
    Geoffrey B. McCowage
    Belinda Kramer
    BMC Bioinformatics, 18
  • [30] Archiving next generation sequencing data
    Shumway, Martin
    Cochrane, Guy
    Sugawara, Hideaki
    NUCLEIC ACIDS RESEARCH, 2010, 38 : D870 - D871