Multi-output Gaussian processes for species distribution modelling

Cited: 17
Authors
Ingram, Martin [1 ]
Vukcevic, Damjan [2 ,3 ]
Golding, Nick [1 ]
Affiliations
[1] Univ Melbourne, Sch BioSci, Parkville, Vic, Australia
[2] Univ Melbourne, Sch Math & Stat, Parkville, Vic, Australia
[3] Univ Melbourne, Melbourne Integrat Genom, Parkville, Vic, Australia
Source
METHODS IN ECOLOGY AND EVOLUTION | 2020, Vol. 11, No. 12
Funding
Australian Research Council
Keywords
Gaussian process; multi-species modelling; species distribution models; ECOLOGICAL THEORY; PREDICTION; TRAITS;
DOI
10.1111/2041-210X.13496
CLC number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
Species distribution modelling is an active area of research in ecology. In recent years, interest has grown in modelling multiple species simultaneously, partly due to the ability to 'borrow strength' from similar species to improve predictions. Mixed and hierarchical models allow this but typically assume a (generalised) linear relationship between covariates and species presence and absence. On the other hand, popular machine learning techniques such as random forests and boosted regression trees are able to model complex nonlinear relationships but consider only one species at a time. We apply multi-output Gaussian processes (MOGPs) to the problem of species distribution modelling. MOGPs model each species' response to the environment as a weighted sum of a small number of nonlinear functions, each modelled by a Gaussian process. While Gaussian process models are notoriously computationally intensive, recent techniques from the machine learning literature as well as using graphics processing units (GPUs) allow us to scale the model to datasets with hundreds of species at thousands of sites. We evaluate the MOGP against four baseline models on six different datasets. Overall, the MOGP is competitive with the best single-species and joint-species models, while being much faster to fit. On single-species metrics (AUC and log likelihood), the MOGP and single-output GPs (SOGPs) outperformed tree-based models (random forest and boosted regression trees) and a joint species distribution model (JSDM). Compared to SOGPs, the MOGP generally has a higher AUC for rare species with fewer than 50 observations in the dataset. When evaluated using joint-species log likelihood, the MOGP outperforms all models apart from the JSDM, which has a better joint likelihood on three datasets and similar performance on the three others. A key advantage of the MOGP is speed: on the largest dataset, it is around 18 times faster to fit than SOGPs, and over 80 times faster to fit than the JSDM. Our results suggest that both MOGPs and SOGPs are accurate predictive models of species distributions and that the MOGP is particularly compelling when predictions for rare species are of interest.
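The latent-factor structure described in the abstract can be sketched as follows (the notation below is ours, and the Bernoulli observation model with a probit-style link \Phi is an assumption for illustration, not a detail confirmed by this record). With K shared latent functions and S species,

\[
g_k(\mathbf{x}) \sim \mathcal{GP}\bigl(0,\; k_\theta(\mathbf{x}, \mathbf{x}')\bigr), \qquad k = 1, \dots, K,
\]
\[
f_s(\mathbf{x}) = \sum_{k=1}^{K} w_{sk}\, g_k(\mathbf{x}), \qquad
y_s(\mathbf{x}) \sim \mathrm{Bernoulli}\bigl(\Phi(f_s(\mathbf{x}))\bigr), \qquad s = 1, \dots, S,
\]

where \mathbf{x} denotes the environmental covariates at a site, w_{sk} are species-specific weights, and K is small relative to S. Sharing the latent functions g_k across species is what allows rare species to 'borrow strength' from the responses of better-sampled species.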
Pages: 1587-1598
Page count: 12