An Integrated SNP Mining and Utilization (ISMU) Pipeline for Next Generation Sequencing Data

Cited by: 11
Authors
Azam, Sarwar [1]
Rathore, Abhishek [1]
Shah, Trushar M. [1]
Telluri, Mohan [1]
Amindala, BhanuPrakash [1]
Ruperao, Pradeep [1, 2]
Katta, Mohan A. V. S. K. [1]
Varshney, Rajeev K. [1]
Affiliations
[1] Int Crops Res Inst Semi Arid Trop, Ctr Excellence Genom, Patancheru, Andhra Pradesh, India
[2] Univ Queensland, Sch Agr & Food Sci, Brisbane, Qld, Australia
Source
PLOS ONE | 2014 / Vol. 9 / Issue 07
Keywords
CHICKPEA CICER-ARIETINUM; NUCLEOTIDE POLYMORPHISM DISCOVERY; READ ALIGNMENT; TRANSCRIPTOME; CROP; DIVERSITY; ANNOTATION; SOFTWARE; GENOTYPE; FORMAT;
DOI
10.1371/journal.pone.0101754
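Chinese Library Classification (CLC) codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification codes
07; 0710; 09
Abstract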
Open source single nucleotide polymorphism (SNP) discovery pipelines for next generation sequencing data commonly require working knowledge of the command line interface, massive computational resources and expertise, which makes them daunting for biologists. Further, the SNP information generated may not be readily usable for downstream processes such as genotyping. Hence, a comprehensive pipeline called Integrated SNP Mining and Utilization (ISMU) has been developed by integrating several open source next generation sequencing (NGS) tools with a graphical user interface, for SNP discovery and for SNP utilization through the development of genotyping assays. The pipeline features pre-processing of raw data, integration of open source alignment tools (Bowtie2, BWA, Maq, NovoAlign and SOAP2), SNP prediction methods (SAMtools/SOAPsnp/CNS2snp and CbCC) and interfaces for developing genotyping assays. The pipeline outputs a list of high quality SNPs between all pairwise combinations of the genotypes analyzed, in addition to the reference genome/sequence. Visualization tools (Tablet and Flapjack) integrated into the pipeline enable inspection of the alignment and of errors, if any. The pipeline also provides a confidence score or polymorphism information content value, with flanking sequences, for identified SNPs in the standard format required for developing marker genotyping (KASP and Golden Gate) assays. The pipeline enables users to process a range of NGS datasets, such as whole genome re-sequencing, restriction site associated DNA sequencing and transcriptome sequencing data, quickly. The pipeline is useful for members of the plant genetics and breeding community with no computational expertise who want to discover SNPs and use them in genomics, genetics and breeding studies. The pipeline has been parallelized to process large next generation sequencing datasets.
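To make the last output step of the abstract concrete, the following is a minimal Python sketch, not ISMU's actual code, of two quantities the abstract mentions: a polymorphism information content (PIC) value and a flanking sequence for assay design. The function names `pic` and `assay_sequence`, the 0-based position, the default flank length of 50 and the bracketed `[REF/ALT]` notation are assumptions made for illustration; the abstract does not specify how ISMU computes or formats them. The PIC formula used is the commonly cited definition, PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j².

```python
from itertools import combinations


def pic(allele_freqs):
    """Polymorphism information content from allele frequencies.

    Uses the commonly cited definition:
    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in allele_freqs)
    correction = sum(
        2 * (p * p) * (q * q) for p, q in combinations(allele_freqs, 2)
    )
    return 1.0 - homozygosity - correction


def assay_sequence(reference, pos, ref_allele, alt_allele, flank=50):
    """Return the flanking sequence with the SNP written as [REF/ALT].

    `pos` is a 0-based index into `reference` (an assumption of this
    sketch). Flanks are truncated at the ends of the sequence.
    """
    if reference[pos] != ref_allele:
        raise ValueError("reference allele does not match the sequence")
    left = reference[max(0, pos - flank):pos]
    right = reference[pos + 1:pos + 1 + flank]
    return f"{left}[{ref_allele}/{alt_allele}]{right}"


if __name__ == "__main__":
    # A biallelic SNP with equal allele frequencies.
    print(pic([0.5, 0.5]))
    # A toy 10-base reference with an A/G SNP at index 4.
    print(assay_sequence("ACGTACGTAC", 4, "A", "G", flank=3))
```

For a biallelic SNP the PIC value peaks at 0.375 when both alleles are equally frequent, which is why it can serve as a simple ranking criterion when choosing markers for an assay.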
Pages: 11