ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data

Times cited: 10475
Authors
Wang, Kai [1]
Li, Mingyao [2]
Hakonarson, Hakon [1, 3]
Affiliations
[1] Childrens Hosp Philadelphia, Ctr Appl Genom, Philadelphia, PA 19104 USA
[2] Univ Penn, Dept Biostat & Epidemiol, Philadelphia, PA 19104 USA
[3] Univ Penn, Dept Pediat, Philadelphia, PA 19104 USA
Keywords
SNPS; ASSOCIATION; GENOMES;
DOI
10.1093/nar/gkq603
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ~4 min to perform gene-based annotation and ~15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.
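The stepwise "variants reduction" idea in the abstract can be illustrated with a minimal Python sketch. This is not ANNOVAR's implementation or interface: the `Variant` record, its field names, the particular filters and their order, and the `min_hits=2` rule for a recessive model are all assumptions chosen for illustration, loosely based on the annotation types the abstract lists (gene consequence, conserved regions, 1000 Genomes, dbSNP).

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical pre-annotated variant record (not ANNOVAR's file format)."""
    gene: str
    exonic_or_splicing: bool   # affects a coding exon or splice site
    synonymous: bool           # leaves the protein sequence unchanged
    conserved: bool            # lies in a conserved genomic region
    in_1000g: bool             # reported in the 1000 Genomes Project
    in_dbsnp: bool             # reported in dbSNP


# Ordered (label, keep-predicate) pairs. Both the choice of filters and
# their order are assumptions made for this sketch.
STEPS = [
    ("exonic or splicing", lambda v: v.exonic_or_splicing),
    ("non-synonymous", lambda v: not v.synonymous),
    ("in conserved region", lambda v: v.conserved),
    ("absent from 1000 Genomes", lambda v: not v.in_1000g),
    ("absent from dbSNP", lambda v: not v.in_dbsnp),
]


def reduce_variants(variants, steps=STEPS):
    """Apply each filter in turn; return the survivors and per-step counts."""
    remaining = list(variants)
    counts = []
    for label, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts


def candidate_genes(variants, min_hits=2):
    """Genes with at least min_hits surviving variants (assumed recessive model)."""
    hits = Counter(v.gene for v in variants)
    return sorted(gene for gene, n in hits.items() if n >= min_hits)
```

Calling `reduce_variants` on a list of `Variant` records returns the surviving variants together with the count remaining after each step, which makes it easy to see how sharply each filter narrows the set; `candidate_genes` then groups the survivors by gene.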
Pages: 7