An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

Cited by: 11
Authors
Azam, Sarwar [1]
Rathore, Abhishek [1]
Shah, Trushar M. [1]
Telluri, Mohan [1]
Amindala, BhanuPrakash [1]
Ruperao, Pradeep [1, 2]
Katta, Mohan A. V. S. K. [1]
Varshney, Rajeev K. [1]
Affiliations
[1] Int Crops Res Inst Semi Arid Trop, Ctr Excellence Genom, Patancheru, Andhra Pradesh, India
[2] Univ Queensland, Sch Agr & Food Sci, Brisbane, Qld, Australia
Source
PLOS ONE | 2014 / Vol. 9 / Issue 07
Keywords
CHICKPEA CICER-ARIETINUM; NUCLEOTIDE POLYMORPHISM DISCOVERY; READ ALIGNMENT; TRANSCRIPTOME; CROP; DIVERSITY; ANNOTATION; SOFTWARE; GENOTYPE; FORMAT;
DOI
10.1371/journal.pone.0101754
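Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification codes
07; 0710; 09
Abstract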
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing (NGS) data commonly require working knowledge of the command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source NGS tools with a graphical user interface, both for SNP discovery and for using the discovered SNPs to develop genotyping assays. The pipeline features pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of the genotypes analyzed, as well as against the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignments and of any errors. The pipeline also provides a confidence score or polymorphism information content value, with flanking sequences, for the identified SNPs in the standard format required for developing marker genotyping assays (KASP and Golden Gate). The pipeline enables users to quickly process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data. The pipeline is aimed at members of the plant genetics and breeding community without computational expertise who want to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process very large NGS datasets.
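As a rough illustration of two quantities the abstract mentions, the sketch below computes a polymorphism information content (PIC) value from allele frequencies and lists the positions that differ between every pair of genotypes. It is not ISMU's code: the PIC formula used is the commonly cited Botstein et al. (1980) definition, which the abstract does not confirm as the one ISMU applies, and the function names, the `calls` data layout and the genotype labels are all hypothetical.

```python
from itertools import combinations


def pic(allele_freqs):
    """PIC under the Botstein et al. (1980) definition (an assumption here):
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


def pairwise_snps(calls):
    """calls maps a genotype name to {position: base} (hypothetical layout).

    Returns {(g1, g2): sorted positions called in both genotypes
    where the bases differ}, for every pair of genotypes.
    """
    result = {}
    for g1, g2 in combinations(sorted(calls), 2):
        shared = calls[g1].keys() & calls[g2].keys()
        result[(g1, g2)] = sorted(
            pos for pos in shared if calls[g1][pos] != calls[g2][pos]
        )
    return result


if __name__ == "__main__":
    # Hypothetical base calls for three genotypes at three positions.
    calls = {
        "genoA": {101: "A", 250: "G", 377: "T"},
        "genoB": {101: "A", 250: "C", 377: "T"},
        "genoC": {101: "T", 250: "C"},
    }
    for pair, positions in pairwise_snps(calls).items():
        print(pair, positions)
    print(pic([0.5, 0.5]))
```

For a biallelic SNP the PIC value peaks at 0.375 when both alleles have frequency 0.5, which is why it is often used to rank candidate markers.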
Pages: 11