Partial least squares regression, support vector machine regression, and transcriptome-based distances for prediction of maize hybrid performance with gene expression data

Cited by: 31
Authors
Fu, Junjie [2 ]
Falke, K. Christin [3 ]
Thiemann, Alexander [4 ]
Schrag, Tobias A. [5 ]
Melchinger, Albrecht E. [5 ]
Scholten, Stefan [4 ]
Frisch, Matthias [1 ]
Affiliations
[1] Univ Giessen, Inst Agron & Plant Breeding 2, D-35392 Giessen, Germany
[2] Chinese Acad Agr Sci, Inst Crop Sci, Beijing 100081, Peoples R China
[3] Univ Munster, Inst Evolut & Biodivers, D-48149 Munster, Germany
[4] Univ Hamburg, Bioctr Klein Flottbek, D-22609 Hamburg, Germany
[5] Univ Hohenheim, Inst Plant Breeding Seed Sci & Populat Genet, D-70593 Stuttgart, Germany
Keywords
DRY-MATTER CONTENT; GRAIN-YIELD; INFORMATION; HETEROSIS;
DOI
10.1007/s00122-011-1747-9
Chinese Library Classification
S3 [Agriculture (Agronomy)];
Discipline classification code
0901 ;
Abstract
The performance of hybrids can be predicted with gene expression data from their parental inbred lines. Implementing such prediction approaches in breeding programs promises to increase the efficiency of hybrid breeding. The objective of our study was to compare the accuracy of prediction models employing multiple linear regression (MLR), partial least squares regression (PLS), support vector machine regression (SVM), and transcriptome-based distances (D_B). For a factorial of 7 flint and 14 dent maize lines, the grain yield of the hybrids was assessed and the gene expression of the parental lines was profiled with a 56k microarray. The accuracy of the prediction models was measured by the correlation between predicted and observed yield, employing two cross-validation schemes. The first modeled the prediction of hybrids when testcross data are available for both parental lines (type 2 hybrids), and the second modeled the prediction of hybrids when no testcross data are available for the parental lines (type 0 hybrids). MLR, SVM, and PLS resulted in a high correlation between predicted and observed yield for type 2 hybrids, whereas for type 0 hybrids D_B had greater prediction accuracy. The regression methods were robust to the choice of the set of profiled genes and required only a few hundred genes. In contrast, for an accurate hybrid prediction with D_B, 1,000-1,500 genes were required, and the prediction accuracy depended strongly on the set of profiled genes. We conclude that for prediction within one set of genetic material MLR is a promising approach, and for transferring prediction models from one set of genetic material to a related one, the transcriptome-based distance D_B is most promising.
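The distinction between type 2 and type 0 hybrids, and the use of a correlation as the accuracy measure, can be illustrated with a small sketch. It is not the authors' procedure: the line names, the choice of 4 flint and 8 dent lines as tested parents, and the helper names `hybrid_type` and `pearson` are assumptions made only for illustration. A hybrid is labeled by how many of its two parents also occur in the training set.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Hypothetical line names for a 7 x 14 flint-by-dent factorial (98 hybrids).
flints = [f"F{i}" for i in range(1, 8)]
dents = [f"D{j}" for j in range(1, 15)]
all_hybrids = list(product(flints, dents))

# Assumption: the first 4 flint and first 8 dent lines are the tested parents,
# and one of their crosses, (F1, D1), is held out of the training set.
train = [h for h in product(flints[:4], dents[:8]) if h != ("F1", "D1")]


def hybrid_type(hybrid, train_set):
    """Count how many parents of `hybrid` appear in some training hybrid (0, 1 or 2)."""
    flint, dent = hybrid
    flint_tested = any(f == flint for f, _ in train_set)
    dent_tested = any(d == dent for _, d in train_set)
    return int(flint_tested) + int(dent_tested)


def pearson(x, y):
    """Pearson correlation between predicted and observed values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Classify every hybrid that is not part of the training set.
train_lookup = set(train)
untested = [h for h in all_hybrids if h not in train_lookup]
type_counts = Counter(hybrid_type(h, train) for h in untested)
print(dict(type_counts))
```

In a real cross-validation the type 2 and type 0 hybrids would each form a validation set, and `pearson` would be applied to the predicted and observed yields within each set.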
Pages: 825-833
Number of pages: 9
Related papers
50 in total
  • [1] Partial least squares regression, support vector machine regression, and transcriptome-based distances for prediction of maize hybrid performance with gene expression data
    Junjie Fu
    K. Christin Falke
    Alexander Thiemann
    Tobias A. Schrag
    Albrecht E. Melchinger
    Stefan Scholten
    Matthias Frisch
    [J]. Theoretical and Applied Genetics, 2012, 124 : 825 - 833
  • [2] Support vector machine regression for the prediction of maize hybrid performance
    Maenhout, S.
    De Baets, B.
    Haesaert, G.
    Van Bockstaele, E.
    [J]. THEORETICAL AND APPLIED GENETICS, 2007, 115 (07) : 1003 - 1013
  • [3] Support vector machine regression for the prediction of maize hybrid performance
    S. Maenhout
    B. De Baets
    G. Haesaert
    E. Van Bockstaele
    [J]. Theoretical and Applied Genetics, 2007, 115 : 1003 - 1013
  • [4] Research on regional economy prediction based on partial least squares support vector regression
    Hongshan
    Ai, Junjun Shi
    [J]. International Journal of Applied Environmental Sciences, 2013, 8 (13): : 1645 - 1652
  • [5] Mapped least squares support vector machine regression
    Zheng, S
    Sun, YQ
    Tian, JW
    Liu, J
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2005, 19 (03) : 459 - 475
  • [6] Research on Application of Regression Least Squares Support Vector Machine on Performance Prediction of Hydraulic Excavator
    Chen, Zhan-bo
    [J]. JOURNAL OF CONTROL SCIENCE AND ENGINEERING, 2014, 2014 (2014)
  • [7] Fund Trend Prediction Based on least Squares Support Vector Regression
    Bao Yilan
    [J]. FOURTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2011): COMPUTER VISION AND IMAGE ANALYSIS: PATTERN RECOGNITION AND BASIC TECHNOLOGIES, 2012, 8350
  • [8] Least squares support vector machine regression with additional constrains
    Ye Hong
    Sun, Bing-Yu
    Wang, Ru Jing
    [J]. PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE: 50 YEARS' ACHIEVEMENTS, FUTURE DIRECTIONS AND SOCIAL IMPACTS, 2006, : 682 - 684
  • [9] Monitoring model for dam seepage based on partial least-squares regression and partial least square support vector machine
    Hohai University, Nanjing 210098, China
    Unknown
    [J]. Shuili Xuebao, 2008, 12 (1390-1394+1400):
  • [10] Least squares Support Vector Machine regression for discriminant analysis
    Van Gestel, T
    Suykens, JAK
    De Brabanter, J
    De Moor, B
    Vandewalle, J
    [J]. IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 2445 - 2450