Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction

Cited by: 45
Authors
He, Dan [1]
Kuhn, David [2]
Parida, Laxmi [1]
Affiliations
[1] IBM TJ Watson Res, Yorktown Hts, NY 10598 USA
[2] USDA ARS, Subtrop Hort Res Stn, 13601 Old Cutler Rd, Miami, FL 33158 USA
Keywords
MARKER-ASSISTED SELECTION; GENOMIC SELECTION;
DOI
10.1093/bioinformatics/btw249
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually formulated as a linear regression model. In many cases, multiple traits are observed for the same set of samples and markers, and some of these traits may be correlated with each other. Modeling all the traits together may therefore improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least-squares error of the predictions from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits.
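To make the multiple output regression setting concrete, the following is a minimal sketch, not the authors' method: it fits a plain ridge regression for two simulated traits that share one genotype matrix. The function names (`simulate`, `fit_multi_output_ridge`), the 0/1/2 genotype encoding, the penalty `lam`, and all data are illustrative assumptions. Note that this particular formulation decouples across traits, so it only shows the data layout; algorithms that exploit trait correlation add a coupling term between the coefficient columns.

```python
import numpy as np

def simulate(n_samples=200, n_markers=50, seed=0):
    """Simulate genotypes and two correlated traits (hypothetical data)."""
    rng = np.random.default_rng(seed)
    # Assumed encoding: 0, 1 or 2 copies of one allele per biallelic marker.
    X = rng.integers(0, 3, size=(n_samples, n_markers)).astype(float)
    # The traits are correlated because they share most of their marker effects.
    shared = rng.normal(size=n_markers)
    B_true = np.column_stack([
        shared + 0.1 * rng.normal(size=n_markers),
        shared + 0.1 * rng.normal(size=n_markers),
    ])
    Y = X @ B_true + rng.normal(size=(n_samples, 2))
    return X, Y

def fit_multi_output_ridge(X, Y, lam=1.0):
    """Closed-form ridge for all traits: B = (X^T X + lam * I)^{-1} X^T Y."""
    n_markers = X.shape[1]
    A = X.T @ X + lam * np.eye(n_markers)
    # One linear solve handles every trait column of Y at once.
    return np.linalg.solve(A, X.T @ Y)

if __name__ == "__main__":
    X, Y = simulate()
    B = fit_multi_output_ridge(X, Y, lam=1.0)
    mse = np.mean((X @ B - Y) ** 2, axis=0)
    print("coefficient matrix shape:", B.shape)
    print("training MSE per trait:", mse)
```

In this sketch each column of `B` holds the estimated marker effects for one trait. Fitting a single trait column alone gives the same coefficients, which is why it serves only as a baseline against which a joint multitrait model could be compared.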
Pages: 37-43
Number of pages: 7