A reference haplotype panel for genome-wide imputation of short tandem repeats

Cited by: 46
Authors
Saini, Shubham [1]
Mitra, Ileena [2]
Mousavi, Nima [3]
Fotsing, Stephanie Feupe [2, 4]
Gymrek, Melissa [1, 5]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[2] Univ Calif San Diego, Bioinformat & Syst Biol Program, 9500 Gilman Dr, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Elect & Comp Engn, 9500 Gilman Dr, La Jolla, CA 92093 USA
[4] Univ Calif San Diego, Dept Biomed Informat, 9500 Gilman Dr, La Jolla, CA 92093 USA
[5] Univ Calif San Diego, Dept Med, 9500 Gilman Dr, La Jolla, CA 92093 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
GENE-EXPRESSION VARIATION; LINKAGE DISEQUILIBRIUM; DNA METHYLATION; CAG REPEAT; EXPANSION; MICROSATELLITE; VARIANTS; MUTATION; DISEASE; ASSOCIATION;
DOI
10.1038/s41467-018-06694-0
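Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification code
07; 0710; 09
Abstract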
Short tandem repeats (STRs) are involved in dozens of Mendelian disorders and have been implicated in complex traits. However, genotyping arrays used in genome-wide association studies focus on single nucleotide polymorphisms (SNPs) and do not readily allow identification of STR associations. We leverage next-generation sequencing (NGS) from 479 families to create a SNP + STR reference haplotype panel. Our panel enables imputing STR genotypes into SNP array data when NGS is not available for directly genotyping STRs. Imputed genotypes achieve mean concordance of 97% with observed genotypes in an external dataset compared to 71% expected under a naive model. Performance varies widely across STRs, with near perfect concordance at bi-allelic STRs vs. 70% at highly polymorphic repeats. Imputation increases power over individual SNPs to detect STR associations with gene expression. Imputing STRs into existing SNP datasets will enable the first large-scale STR association studies across a range of complex traits.
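The concordance figures quoted in the abstract compare imputed genotypes with observed ones. The abstract does not state how concordance is scored, so the following is only a minimal sketch under one common assumption: each diploid genotype is an unordered pair of repeat lengths, and a sample scores 0, 0.5, or 1 depending on how many alleles the imputed call shares with the observed call. The function names `allele_concordance` and `mean_concordance` and the toy genotypes are hypothetical, not taken from the paper.

```python
from collections import Counter


def allele_concordance(observed, imputed):
    """Fraction of alleles shared by two unordered diploid genotypes.

    Assumed scoring (not taken from the paper): 1.0 if both alleles match,
    0.5 if exactly one matches, 0.0 if none match.
    """
    # Multiset intersection ignores allele order and respects homozygotes.
    shared = sum((Counter(observed) & Counter(imputed)).values())
    return shared / 2


def mean_concordance(observed_calls, imputed_calls):
    """Average per-sample concordance at a single STR locus."""
    scores = [
        allele_concordance(obs, imp)
        for obs, imp in zip(observed_calls, imputed_calls)
    ]
    return sum(scores) / len(scores)


# Toy data: repeat lengths for three samples at one hypothetical STR.
observed = [(12, 12), (12, 14), (13, 15)]
imputed = [(12, 12), (14, 12), (13, 13)]

print(mean_concordance(observed, imputed))
```

In this toy case the first two samples match fully (allele order is ignored) and the third shares one allele, so the locus-level mean is 2.5 / 3. Averaging such per-locus values across loci would give a panel-wide figure comparable in spirit, though not necessarily in definition, to the 97% reported above.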
Pages: 11