Next-Generation Sequencing Data Analysis on Pool-Seq and Low-Coverage Retinoblastoma Data

Cited by: 1
Authors
Ozdemir Ozdogan, Gulistan [1]
Kaya, Hilal [1]
Affiliations
[1] Ankara Yildirim Beyazit Univ, Dept Comp Engn, TR-06010 Ankara, Turkey
Keywords
Low-coverage sequencing; NGS data analysis; Pool-seq; Retinoblastoma; READ ALIGNMENT; VARIANTS;
DOI
10.1007/s12539-020-00374-8
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Next-generation sequencing (NGS) refers to massively parallel or deep deoxyribonucleic acid (DNA) sequencing technology, which has revolutionized genomic research in recent years. Although the cost of generating NGS data has decreased since the technology first emerged, cost can still be a problem. Hence, new strategies such as pool-seq and low-coverage sequencing have been developed to reduce it. Although these strategies lower cost, it is important to establish whether they are effective in NGS studies. We applied a bioinformatics pipeline to pool-seq, low-coverage retinoblastoma data obtained from tumor samples only. Retinoblastoma is a childhood eye malignancy that is initiated by RB1 mutation or MYCN amplification and can lead to loss of vision in one or both eyes, and sometimes to loss of life. We applied our pipeline to the retinoblastoma data and to two other datasets, both to test its validity and to compare performance. High-confidence variant calls from the Genome in a Bottle Consortium were used for these purposes. Our pipeline called a higher number of variants than a standard pipeline on all three datasets. In addition, recall and F-score values were notably better with our pipeline. We further present our results on the disease data in terms of variants, variant types, and disease-related genes. This study provides a guideline for running an NGS data analysis pipeline on data that combine pool-seq and low-coverage sequencing. To obtain more conclusive results for these two strategies, we recommend using cancer data with higher mutation rates and larger pools.
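The abstract reports recall and F-score against Genome in a Bottle high-confidence calls but does not say how the comparison was carried out. The following is only a minimal sketch of how such metrics are commonly computed, assuming variants are matched exactly on (chromosome, position, reference allele, alternate allele). The function name `evaluate_calls` and the toy variant sets are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a variant is identified by (chrom, pos, ref, alt) and two
# calls match only if all four fields are identical.
Variant = Tuple[str, int, str, str]


class Metrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    recall: float
    f_score: float


def evaluate_calls(called: Set[Variant], truth: Set[Variant]) -> Metrics:
    """Compare a pipeline's calls with a truth set (hypothetical helper)."""
    tp = len(called & truth)   # called and present in the truth set
    fp = len(called - truth)   # called but absent from the truth set
    fn = len(truth - called)   # in the truth set but not called
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-score here is the harmonic mean of precision and recall (F1).
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return Metrics(tp, fp, fn, precision, recall, f_score)


# Toy data for illustration only; these are not real variant calls.
truth_set: Set[Variant] = {
    ("chr1", 100, "A", "G"),
    ("chr1", 250, "C", "T"),
    ("chr2", 400, "G", "A"),
    ("chr2", 900, "T", "C"),
}
called_set: Set[Variant] = {
    ("chr1", 100, "A", "G"),
    ("chr2", 400, "G", "A"),
    ("chr3", 50, "C", "G"),
}

result = evaluate_calls(called_set, truth_set)
print(f"precision={result.precision:.3f} "
      f"recall={result.recall:.3f} F={result.f_score:.3f}")
```

Exact tuple matching is the simplest possible comparison. Real benchmarking usually normalizes variant representation and restricts the comparison to high-confidence regions, which this sketch does not attempt.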
Pages: 302-310
Page count: 9