A survey of tools for variant analysis of next-generation genome sequencing data

Cited by: 352
Authors
Pabinger, Stephan [1]
Dander, Andreas [2, 3]
Fischer, Maria [1]
Snajder, Rene [1, 2]
Sperk, Michael [3]
Efremova, Mirjana [3]
Krabichler, Birgit [4]
Speicher, Michael R. [5]
Zschocke, Johannes [4]
Trajanoski, Zlatko [1]
Affiliations
[1] Med Univ Innsbruck, Div Bioinformat, A-6020 Innsbruck, Austria
[2] Oncotyrol, Innsbruck, Austria
[3] Med Univ Innsbruck, A-6020 Innsbruck, Austria
[4] Med Univ Innsbruck, Div Human Genet, A-6020 Innsbruck, Austria
[5] Med Univ Graz, Inst Human Genet, Graz, Austria
Funding
Austrian Science Fund
Keywords
Mendelian disorders; cancer; variants; bioinformatics tools; next-generation sequencing; copy-number variation; whole-genome; quality control; read alignment; functional characterization; structural variation; Mendelian disease; point mutations; exome
DOI
10.1093/bib/bbs086
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Recent advances in genome sequencing technologies provide unprecedented opportunities to characterize individual genomic landscapes and identify mutations relevant for diagnosis and therapy. Specifically, whole-exome sequencing using next-generation sequencing (NGS) technologies is gaining popularity in the human genetics community due to the moderate costs, manageable data amounts and straightforward interpretation of analysis results. While whole-exome and, in the near future, whole-genome sequencing are becoming commodities, data analysis still poses significant challenges and has led to the development of a plethora of tools supporting specific parts of the analysis workflow or providing a complete solution. Here, we surveyed 205 tools for whole-genome/whole-exome sequencing data analysis supporting five distinct analytical steps: quality assessment, alignment, variant identification, variant annotation and visualization. We report an overview of the functionality, features and specific requirements of the individual tools. We then selected 32 programs for variant identification, variant annotation and visualization, which were subjected to hands-on evaluation using four data sets: one set of exome data from two patients with a rare disease for testing identification of germline mutations, two cancer data sets for testing variant callers for somatic mutations, copy number variations and structural variations, and one semi-synthetic data set for testing identification of copy number variations. Our comprehensive survey and evaluation of NGS tools provide a valuable guideline for human geneticists working on Mendelian disorders, complex diseases and cancers.
Pages: 256-278
Page count: 23