Non-linear phylogenetic regression using regularised kernels

Times cited: 1
Authors
Rosas-Puchuri, Ulises [1]
Santaquiteria, Aintzane [1]
Khanmohammadi, Sina [2,3]
Solis-Lemus, Claudia [4]
Betancur-R, Ricardo [1,5]
Affiliations
[1] Univ Oklahoma, Sch Biol Sci, Norman, OK 73019 USA
[2] Univ Oklahoma, Sch Comp Sci, Norman, OK USA
[3] Univ Oklahoma, Data Sci & Analyt Inst, Norman, OK USA
[4] Univ Wisconsin Madison, Wisconsin Inst Discovery, Dept Plant Pathol, Madison, WI USA
[5] Univ Calif San Diego, Scripps Inst Oceanog, La Jolla, CA USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2024 / Vol. 15 / Issue 09
Keywords
kernel ridge regression; phylogenetic comparative methods; supervised machine learning; weighted least-squares; EVOLUTION;
DOI
10.1111/2041-210X.14385
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
Phylogenetic regression is a type of generalised least squares (GLS) method that incorporates a modelled covariance matrix based on the evolutionary relationships between species (i.e. phylogenetic relationships). While this method has found widespread use in hypothesis testing via phylogenetic comparative methods, such as phylogenetic ANOVA, its ability to account for non-linear relationships has received little attention. To address this, here we implement a phylogenetic Kernel Ridge Regression (phyloKRR) method that utilises GLS in a high-dimensional feature space, employing linear combinations of phylogenetically weighted data to account for non-linearity. We analysed two biological datasets using the Radial Basis Function and linear kernel functions. The first dataset contained morphometric data, while the second dataset comprised discrete trait data, with diversification rates as the response variable. Hyperparameter tuning of the model was achieved through cross-validation rounds in the training set. In the tested biological datasets, phyloKRR reduced the error rate (as measured by RMSE) by around 20% compared to linear-based regression when data did not exhibit linear relationships. In simulated datasets, the error rate decreased almost exponentially with the level of non-linearity. These results show that introducing kernels into phylogenetic regression analysis presents a novel and promising tool for complementing phylogenetic comparative methods. We have integrated this method into a Python package named phyloKRR, which is freely available at: .
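The idea of combining GLS-style phylogenetic weighting with kernel ridge regression can be illustrated with a minimal sketch. It does not reproduce the phyloKRR package or its API. It assumes, purely for illustration, that the phylogenetic weighting is applied by whitening the predictors and the response with the inverse Cholesky factor of the covariance matrix before an ordinary kernel ridge fit with a Radial Basis Function kernel. The function names (`rbf_kernel`, `whiten`, `fit_krr`), the toy covariance matrix `C`, and the toy data are all hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Radial Basis Function kernel between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def whiten(C, M):
    """Return L^{-1} M, where C = L L^T is the Cholesky factorisation of C."""
    L = np.linalg.cholesky(C)
    return np.linalg.solve(L, M)

def fit_krr(Xw, yw, gamma, lam):
    """Solve (K + lam * I) alpha = yw for the dual coefficients alpha."""
    K = rbf_kernel(Xw, Xw, gamma)
    return np.linalg.solve(K + lam * np.eye(len(yw)), yw)

# Hypothetical covariance for four species on a toy tree ((A,B),(C,D)):
# sister species share half of the root-to-tip path, the two clades share none.
C = np.array([
    [1.0, 0.5, 0.0, 0.0],
    [0.5, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.5],
    [0.0, 0.0, 0.5, 1.0],
])

# Toy non-linear relationship between one predictor and the response.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.sin(X[:, 0])

# Assumed weighting step: whiten the data with the phylogenetic covariance.
Xw = whiten(C, X)
yw = whiten(C, y)

# Kernel ridge fit on the whitened data; gamma and lam are arbitrary here.
alpha = fit_krr(Xw, yw, gamma=1.0, lam=1e-8)
fitted = rbf_kernel(Xw, Xw, gamma=1.0) @ alpha
rmse = float(np.sqrt(np.mean((fitted - yw) ** 2)))
```

In practice the kernel width `gamma` and the ridge penalty `lam` would not be fixed by hand; the abstract states that hyperparameters were tuned through cross-validation rounds in the training set, and the near-zero penalty used here only makes the toy fit interpolate its four training points.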
Pages: 1611-1623
Number of pages: 13
Related papers
50 in total
  • [41] Minimizing recalibration using a non-linear regression technique for thermal anemometry
    Agrawal, Rishav
    Whalley, Richard D.
    Ng, Henry C. -H.
    Dennis, David J. C.
    Poole, Robert J.
    EXPERIMENTS IN FLUIDS, 2019, 60 (07)
  • [42] Privacy Preserving Support Vector Machine using Non-linear Kernels on Hadoop Mahout
    Teo, Sin G.
    Han, Shuguo
    Lee, Vincent C. S.
    2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 941 - 948
  • [43] Numerical Non-Linear Modelling Algorithm Using Radial Kernels on Local Mesh Support
    Jose Navarro-Gonzalez, Francisco
    Villacampa, Yolanda
    Cortes-Molina, Monica
    Ivorra, Salvador
    MATHEMATICS, 2020, 8 (09)
  • [44] Prediction of Rice Cultivation in India-Support Vector Regression Approach with Various Kernels for Non-Linear Patterns
    Paidipati, Kiran Kumar
    Chesneau, Christophe
    Nayana, B. M.
    Kumar, Kolla Rohith
    Polisetty, Kalpana
    Kurangi, Chinnarao
    AGRIENGINEERING, 2021, 3 (02): : 182 - 198
  • [45] ON BAYESIAN NON-LINEAR REGRESSION WITH AN ENZYME EXAMPLE
    EAVES, DM
    BIOMETRIKA, 1983, 70 (02) : 373 - 379
  • [46] APPLICATION OF STEPWISE REGRESSION TO NON-LINEAR ESTIMATION
    JENNRICH, RI
    SAMPSON, PF
    TECHNOMETRICS, 1968, 10 (01) : 63 - &
  • [47] A NON-LINEAR REGRESSION PROGRAM FOR SMALL COMPUTERS
    DUGGLEBY, RG
    ANALYTICAL BIOCHEMISTRY, 1981, 110 (01) : 9 - 18
  • [48] Stochastic Development Regression on Non-linear Manifolds
    Kuhnel, Line
    Sommer, Stefan
    INFORMATION PROCESSING IN MEDICAL IMAGING (IPMI 2017), 2017, 10265 : 53 - 64
  • [49] ACE - A NON-LINEAR REGRESSION-MODEL
    FRANK, IE
    LANTERI, S
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 1988, 3 (04) : 301 - 313
  • [50] PROGRAM FOR NON-LINEAR MULTIVARIATE REGRESSION - GCM
    SNELLA, JJ
    ECONOMETRICA, 1978, 46 (02) : 481 - 481