Next-generation sequencing revolution through big data analytics

Cited by: 24
Authors
Tripathi, Rashmi [1]
Sharma, Pawan [1]
Chakraborty, Pavan [2]
Varadwaj, Pritish Kumar [2]
Affiliations
[1] Indian Inst Informat Technol Allahabad, Dept Bioinformat, Allahabad, Uttar Pradesh, India
[2] Indian Inst Informat Technol Allahabad, Dept Informat Technol, Allahabad, Uttar Pradesh, India
Source
FRONTIERS IN LIFE SCIENCE | 2016 / Volume 9 / Issue 02
Keywords
Big data; cloud computing; Hadoop; next-generation sequencing; genomics; ANALYSIS TOOL; R PACKAGE; FRAMEWORK; HADOOP; CHIP; TRANSCRIPTION; GENOMES; WEB;
DOI
10.1080/21553769.2016.1178180
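Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract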
Next-generation sequencing (NGS) technology has led to an unrivaled explosion in the amount of genomic data, and this growth has in turn raised the challenges of sharing, archiving, integrating and analyzing these data. The scale and efficiency of NGS make it difficult to analyze these vast genomic data sets, including gene interactions, annotations and expression studies. However, this limitation can be overcome by tools and algorithms built on big data frameworks. Here we review the current state of knowledge of big data algorithms for NGS that reveal hidden patterns in sequencing, analysis and annotation. The Apache Hadoop framework provides an on-demand, adaptable environment for large-scale data analysis. It has several components for partitioning large-scale data across clusters of commodity hardware in a fault-tolerant manner. Hadoop's MapReduce model and packages such as CloudBurst, Crossbow, Myrna, Eoulsan, DistMap, Seal and Contrail support various NGS applications, such as adapter trimming, quality checking, read mapping, de novo assembly, quantification, expression analysis, variant analysis and annotation. This review covers the current applications of Hadoop technology, with their usage and limitations, from the perspective of NGS.
Pages: 119-149
Page count: 31
Related papers
50 entries in total
  • [31] Genotyping microsatellites in next-generation sequencing data
    Dashnow, Harriet
    Tan, Susan
    Das, Debjani
    Easteal, Simon
    Oshlack, Alicia
    [J]. BMC BIOINFORMATICS, 2015, 16
  • [32] Genotyping microsatellites in next-generation sequencing data
    Harriet Dashnow
    Susan Tan
    Debjani Das
    Simon Easteal
    Alicia Oshlack
    [J]. BMC Bioinformatics, 16
  • [33] Cancer research in the era of next-generation sequencing and big data calls for intelligent modeling
    Yli-Hietanen, Jari
    Ylipaa, Antti
    Yli-Harja, Olli
    [J]. CHINESE JOURNAL OF CANCER, 2015, 34
  • [34] Cancer research in the era of next-generation sequencing and big data calls for intelligent modeling
    Jari Yli-Hietanen
    Antti Ylipää
    Olli Yli-Harja
    [J]. Chinese Journal of Cancer (癌症), 2015, 34 (10) : 423 - 426
  • [35] Next-Generation Sequencing: Next-Generation Quality in Pediatrics
    Wortmann, Saskia B.
    Spenger, Johannes
    Preisel, Martin
    Koch, Johannes
    Rauscher, Christian
    Bader, Ingrid
    Mayr, Johannes A.
    Sperl, Wolfgang
    [J]. PADIATRIE UND PADOLOGIE, 2018, 53 (06) : 278 - 283
  • [36] Next-generation sequencing for next-generation breeding, and more
    Tsai, Chung-Jui
    [J]. NEW PHYTOLOGIST, 2013, 198 (03) : 635 - 637
  • [37] Next-Generation Sequencing Demands Next-Generation Phenotyping
    Hennekam, Raoul C. M.
    Biesecker, Leslie G.
    [J]. HUMAN MUTATION, 2012, 33 (05) : 884 - 886
  • [38] The Molecular Revolution in Cutaneous Biology: Era of Next-Generation Sequencing
    Sarig, Ofer
    Sprecher, Eli
    [J]. JOURNAL OF INVESTIGATIVE DERMATOLOGY, 2017, 137 (05) : E79 - E82
  • [39] Next-generation sequencing
    Haferlach, T.
    [J]. ONCOLOGY RESEARCH AND TREATMENT, 2016, 39 : 40 - 41
  • [40] Next-Generation Sequencing
    Xiong, Momiao
    Zhao, Zhongming
    Arnold, Jonathan
    Yu, Fuli
    [J]. JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY, 2010