Relative performance of cluster algorithms and validation indices in maize genome-wide structure patterns

Authors
María Eugenia Videla
Juliana Iglesias
Cecilia Bruno
Affiliations
[1] Universidad Nacional de Córdoba, Estadística y Biometría, Facultad de Ciencias Agropecuarias (FCA)
[2] Consejo Nacional de Investigaciones Científicas y Tecnológicas (UFyMA-CONICET), Unidad de Fitopatología y Modelización Agrícola
[3] Universidad Nacional de Villa María, Estación Experimental Pergamino
[4] INTA
[5] Instituto Nacional de Tecnología Agropecuaria
[6] UNNOBA
[7] Universidad Nacional del Noroeste de La Provincia de Buenos Aires
Source
Euphytica | 2021 | Volume 217, Issue 10
Keywords
Unsupervised learning; Population genetic structure; Multivariate technique; Outcome misclassification; SNPs; Maize;
DOI
Not available
Abstract
A number of clustering algorithms are available to depict population genetic structure (PGS) from genomic data; however, there is no consensus on which methods perform best. We conducted a simulation study of three PGS scenarios with k = 2, 5 and 10 subpopulations, recreating several maize genomes as a model, to: (1) compare three well-known clustering methods: UPGMA, k-means, and a Bayesian method (BM); (2) assess four internal validation indices (CH, Connectivity, Dunn and Silhouette) for determining a reliable number of groups defining a PGS; and (3) estimate the misclassification rate for each validation index. Moreover, a publicly available maize dataset was used to illustrate the outcomes of our simulation. BM was the best method for classifying individuals in all tested scenarios, with no assignment errors. Conversely, UPGMA had the highest misclassification rate. In scenarios with 5 and 10 subpopulations, the CH and Connectivity indices underestimated the number of groups the most, for all clustering algorithms. The Dunn and Silhouette indices performed best with BM. Nevertheless, since Silhouette measures the degree of confidence in cluster assignment, and BM measures the probability of cluster membership, these results should be considered with caution. In this study, BM proved efficient at depicting the PGS in both simulated and real maize datasets. This study offers a robust alternative for unveiling the existing PGS, thereby facilitating population studies and breeding strategies in maize programs. Moreover, the present findings may have implications for other crop species.
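As an illustration of the kind of workflow the abstract describes, the following is a minimal sketch, not the authors' simulation. It assumes a toy genotype model (SNPs coded 0/1/2, with each subpopulation carrying a high allele frequency on its own subset of markers), clusters individuals with UPGMA (average linkage) on Manhattan distances, picks the number of groups with the mean Silhouette width, and reports a misclassification rate based on the majority true label in each cluster. All function names, parameter values, and the allele-frequency scheme are hypothetical choices made for this example.

```python
import random

def simulate_genotypes(k, n_per_pop, n_markers, rng):
    """Toy model (assumption): marker m has allele frequency 0.95 in
    subpopulation m % k and 0.05 elsewhere; genotypes are coded 0/1/2."""
    genotypes, labels = [], []
    for pop in range(k):
        freqs = [0.95 if m % k == pop else 0.05 for m in range(n_markers)]
        for _ in range(n_per_pop):
            # Sum of two Bernoulli draws gives the allele count at each marker.
            genotypes.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
            labels.append(pop)
    return genotypes, labels

def distance_matrix(rows):
    """Pairwise Manhattan distances between individuals."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = float(sum(abs(a - b) for a, b in zip(rows[i], rows[j])))
    return d

def upgma(dist, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = len(clusters[a]) * len(clusters[b])
                avg = sum(dist[i][j] for i in clusters[a] for j in clusters[b]) / pairs
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    assign = [0] * len(dist)
    for c, members in enumerate(clusters):
        for i in members:
            assign[i] = c
    return assign

def mean_silhouette(dist, assign):
    """Mean Silhouette width; singleton clusters contribute 0 by convention."""
    groups = {}
    for i, c in enumerate(assign):
        groups.setdefault(c, []).append(i)
    total = 0.0
    for i, c in enumerate(assign):
        own = groups[c]
        if len(own) == 1:
            continue
        a = sum(dist[i][j] for j in own) / (len(own) - 1)
        b = min(sum(dist[i][j] for j in m) / len(m) for o, m in groups.items() if o != c)
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(assign)

def misclassification_rate(true_labels, assign):
    """Share of individuals that differ from their cluster's majority true label."""
    errors = 0
    for c in set(assign):
        members = [t for t, a in zip(true_labels, assign) if a == c]
        errors += len(members) - max(members.count(t) for t in set(members))
    return errors / len(assign)

rng = random.Random(42)
genotypes, truth = simulate_genotypes(k=3, n_per_pop=10, n_markers=60, rng=rng)
dist = distance_matrix(genotypes)
scores = {k: mean_silhouette(dist, upgma(dist, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
error = misclassification_rate(truth, upgma(dist, best_k))
print("selected k:", best_k, "misclassification rate:", error)
```

The subpopulations in this toy setting are far better separated than real maize germplasm is likely to be, so it says nothing about the relative ranking of methods reported in the abstract. The majority-label error measure is also a simplification: it cannot penalize over-splitting, so it is only meaningful when the selected number of clusters does not exceed the true number of subpopulations.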