Impact of imputation methods on the amount of genetic variation captured by a single-nucleotide polymorphism panel in soybeans

Cited by: 24
Authors
Xavier, A. [1 ]
Muir, William M. [2 ]
Rainey, Katy M. [1 ]
Affiliations
[1] Purdue Univ, Dept Agron, Lilly Hall Life Sci,915 W State St, W Lafayette, IN 47907 USA
[2] Purdue Univ, Dept Anim Sci, Lilly Hall Life Sci,915 W State St, W Lafayette, IN 47907 USA
Source
BMC BIOINFORMATICS | 2016 / Vol. 17
Keywords
Empirical Bayes; Heritability; Genomic selection; Association studies; WHOLE-GENOME REGRESSION; INCREASES POWER; GENOTYPE DATA; PREDICTION; SELECTION; ACCURACY; MARKERS; MODEL; PLANT; ASSOCIATION;
DOI
10.1186/s12859-016-0899-7
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Success in genome-wide association studies and marker-assisted selection depends on good phenotypic and genotypic data. The more complete these data are, the more powerful the results of the analysis. Nevertheless, some next-generation technologies provide genotypic information despite large proportions of missing data. The procedures these technologies use to impute genetic data therefore greatly affect downstream analyses. This study aims to (1) compare the genetic variance in a single-nucleotide polymorphism panel of soybean with missing data imputed using various methods, (2) evaluate the imputation accuracy and post-imputation quality associated with these methods, and (3) evaluate the impact of imputation method on heritability and the accuracy of genome-wide prediction of soybean traits. The imputation methods we evaluated were: multivariate mixed model, hidden Markov model, logical algorithm, k-nearest neighbor, singular value decomposition, and random forest. We used raw genotypes from the SoyNAM project and the following phenotypes: plant height, days to maturity, grain yield, and seed protein composition.
Results: We propose an imputation method based on multivariate mixed models using pedigree information. Our comparison of methods indicates that the heritability of traits can be affected by the imputation method. Genotypes with missing values imputed by methods that use genealogical information can favor genetic analysis of highly polygenic traits, but not genome-wide prediction accuracy. The genotypic matrix captured the highest amount of genetic variance when missing loci were imputed by the method proposed in this paper.
Conclusions: We conclude that hidden Markov models and random forest imputation are better suited to studies that analyze highly heritable traits, while pedigree-based methods are best used to analyze traits with low heritability. Despite the notable contribution to heritability, changing the imputation method did not improve genomic prediction. We identified significant differences across imputation methods in a dataset missing 20% of the genotypic values. This means that data from genotyping technologies that produce a high proportion of missing values, such as GBS, should be handled carefully because the imputation method will affect downstream analysis.
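As a rough illustration of how imputation accuracy can be measured, the sketch below hides a fraction of known genotype calls, imputes them with a simple k-nearest-neighbor rule, and counts how many hidden calls are recovered. It is not the authors' pipeline: the function names (`knn_impute`, `masked_accuracy`), the 0/1/2 genotype coding, the distance measure, and the toy data are all assumptions made for this example. The default masking rate of 0.2 only mirrors the 20% missing-data figure mentioned in the abstract.

```python
import random


def knn_impute(geno, k=3):
    """Fill missing calls (None) in a genotype matrix coded 0/1/2.

    Rows are individuals, columns are markers. Each missing call is replaced
    by the rounded mean of that marker among the k most similar individuals
    that have the marker observed. (Hypothetical helper, for illustration.)
    """
    n = len(geno)

    def distance(a, b):
        # Mean absolute difference over markers observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return float("inf")
        return sum(abs(x - y) for x, y in shared) / len(shared)

    imputed = [row[:] for row in geno]  # leave the input untouched
    for i, row in enumerate(geno):
        # Rank all other individuals by similarity once per row.
        neighbors = sorted((j for j in range(n) if j != i),
                           key=lambda j: distance(row, geno[j]))
        for m, call in enumerate(row):
            if call is not None:
                continue
            donors = [geno[j][m] for j in neighbors if geno[j][m] is not None][:k]
            if donors:  # stays None if no individual has this marker observed
                imputed[i][m] = round(sum(donors) / len(donors))
    return imputed


def masked_accuracy(truth, missing_rate=0.2, k=3, seed=1):
    """Hide a random share of known calls, impute them, and return the
    fraction of hidden calls that were recovered exactly."""
    rng = random.Random(seed)
    cells = [(i, m) for i, row in enumerate(truth) for m in range(len(row))]
    hidden = rng.sample(cells, int(len(cells) * missing_rate))
    masked = [row[:] for row in truth]
    for i, m in hidden:
        masked[i][m] = None
    imputed = knn_impute(masked, k)
    hits = sum(imputed[i][m] == truth[i][m] for i, m in hidden)
    return hits / len(hidden)


if __name__ == "__main__":
    # Toy data (an assumption): two families with contrasting genotypes.
    toy = [[0] * 10 for _ in range(5)] + [[2] * 10 for _ in range(5)]
    print(masked_accuracy(toy, missing_rate=0.2, k=3))
```

The same masking routine could wrap any other imputation function, which is what makes a side-by-side comparison of methods possible.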
Pages: 9
Related papers
50 in total
  • [31] Independent target region amplification polymorphism and single-nucleotide polymorphism marker utility in genetic evaluation of sugarcane genotypes
    Devarumath, Rachayya M.
    Kalwade, Sachin B.
    Bundock, Peter
    Eliott, Frances G.
    Henry, Robert
    PLANT BREEDING, 2013, 132 (06) : 736 - 747
  • [32] A heuristic method for fast and accurate phasing and imputation of single-nucleotide polymorphism data in bi-parental plant populations
    Serap Gonen
    Valentin Wimmer
    R. Chris Gaynor
    Ed Byrne
    Gregor Gorjanc
    John M. Hickey
    Theoretical and Applied Genetics, 2018, 131 : 2345 - 2357
  • [33] A heuristic method for fast and accurate phasing and imputation of single-nucleotide polymorphism data in bi-parental plant populations
    Gonen, Serap
    Wimmer, Valentin
    Gaynor, R. Chris
    Byrne, Ed
    Gorjanc, Gregor
    Hickey, John M.
    THEORETICAL AND APPLIED GENETICS, 2018, 131 (11) : 2345 - 2357
  • [34] Single-Nucleotide Polymorphism-Based Genetic Diversity Analysis of Clinical Pseudomonas aeruginosa Isolates
    Muthukumarasamy, Uthayakumar
    Preusse, Matthias
    Kordes, Adrian
    Koska, Michal
    Schniederjans, Monika
    Khaledi, Ariane
    Haeussler, Susanne
    GENOME BIOLOGY AND EVOLUTION, 2020, 12 (04) : 396 - 406
  • [35] Development and Application of Single-Nucleotide Polymorphism (SNP) Genetic Markers for Conservation Monitoring of Burbot Populations
    Campbell, Matthew R.
    Vu, Ninh V.
    LaGrange, Amanda P.
    Hardy, Ryan S.
    Ross, Tyler J.
    Narum, Shawn R.
    TRANSACTIONS OF THE AMERICAN FISHERIES SOCIETY, 2019, 148 (03) : 661 - 670
  • [36] Genetic Diversity and Population Structure in Solanum nigrum Based on Single-Nucleotide Polymorphism (SNP) Markers
    Li, Jinhui
    Wei, Shouhui
    Huang, Zhaofeng
    Zhu, Yuyong
    Li, Longlong
    Zhang, Yixiao
    Ma, Ziqing
    Huang, Hongjuan
    AGRONOMY-BASEL, 2023, 13 (03):
  • [37] Development and implementation of nested single-nucleotide polymorphism (SNP) assays for breeding and genetic research applications
    Song, Qijian
    Quigley, Charles
    He, Ruifeng
    Wang, Dechun
    Nguyen, Henry
    Miranda, Carrie
    Li, Zenglu
    PLANT GENOME, 2024, 17 (03):
  • [38] Developing a Standardized Single Nucleotide Polymorphism Panel for Rangewide Genetic Monitoring of Bull Trout
    Bohling, Justin
    Von Bargen, Jennifer
    Piteo, Matthew
    Louden, Amelia
    Small, Maureen
    Delomas, Thomas A.
    Kovach, Ryan
    NORTH AMERICAN JOURNAL OF FISHERIES MANAGEMENT, 2021, 41 (06) : 1920 - 1931
  • [39] The clinical application of single-sperm-based single-nucleotide polymorphism haplotyping for PGT of patients with genetic diseases
    Huang, Chenyang
    Zheng, Bo
    Chen, Linjun
    Diao, Zhenyu
    Zhou, Jianjun
    REPRODUCTIVE BIOMEDICINE ONLINE, 2022, 44 (01) : 63 - 71
  • [40] IMPDH2 Genetic Polymorphism: A Promoter Single-Nucleotide Polymorphism Disrupts a Cyclic Adenosine Monophosphate Responsive Element
    Garat, Anne
    Cauffiez, Christelle
    Hamdan-Khalil, Rima
    Glowacki, Francois
    Devos, Aurore
    Leclerc, Julie
    Lionet, Arnaud
    Allorge, Delphine
    Lo-Guidice, Jean-Marc
    Broly, Franck
    GENETIC TESTING AND MOLECULAR BIOMARKERS, 2009, 13 (06) : 841 - 847