Multi‐Output Gaussian Processes for Species Distribution Modelling

2020 
Species distribution modelling is an active area of research in ecology. In recent years, interest has grown in modelling multiple species simultaneously, partly due to the ability to ‘borrow strength’ from similar species to improve predictions. Mixed and hierarchical models allow this but typically assume a (generalised) linear relationship between covariates and species presence and absence. On the other hand, popular machine learning techniques such as random forests and boosted regression trees are able to model complex nonlinear relationships but consider only one species at a time. We apply multi‐output Gaussian processes (MOGPs) to the problem of species distribution modelling. MOGPs model each species' response to the environment as a weighted sum of a small number of nonlinear functions, each modelled by a Gaussian process. While Gaussian process models are notoriously computationally intensive, recent techniques from the machine learning literature, together with graphics processing units (GPUs), allow us to scale the model to datasets with hundreds of species at thousands of sites. We evaluate the MOGP against four baseline models on six different datasets. Overall, the MOGP is competitive with the best single‐species and joint‐species models, while being much faster to fit. On single‐species metrics (AUC and log likelihood), the MOGP and single‐output GPs outperform tree‐based models (random forest and boosted regression trees) and a joint species distribution model (JSDM). Compared to single‐output GPs, the MOGP generally has a higher AUC for rare species with fewer than 50 observations in the dataset. When evaluated using joint‐species log likelihood, the MOGP outperforms all models apart from the JSDM, which has a better joint likelihood on three datasets and similar performance on the other three. 
A key advantage of the MOGP is speed: on the largest dataset, it is around 18 times faster to fit than single‐output GPs (SOGPs), and over 80 times faster to fit than the JSDM. Our results suggest that both MOGPs and SOGPs are accurate predictive models of species distributions and that the MOGP is particularly compelling when predictions for rare species are of interest.
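
The "weighted sum of a small number of nonlinear functions" structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the squared-exponential kernel, the logistic link, the standard-normal weights, and all names (`rbf_kernel`, `sample_mogp_prior`, `n_latent`) are assumptions chosen for illustration. The sketch only draws from a toy prior; it does not perform inference or use the scaling techniques the abstract mentions.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel matrix for the rows of X (assumed kernel)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def sample_mogp_prior(X, n_species, n_latent, rng):
    """Draw presence probabilities for all species from a toy MOGP prior."""
    n_sites = X.shape[0]
    # Small jitter on the diagonal keeps the Cholesky factorisation stable.
    K = rbf_kernel(X) + 1e-6 * np.eye(n_sites)
    chol = np.linalg.cholesky(K)
    # Each column of G is one latent nonlinear function evaluated at every site.
    G = chol @ rng.standard_normal((n_sites, n_latent))
    # Each row of W holds one species' weights on the shared latent functions.
    W = rng.standard_normal((n_species, n_latent))
    # Species responses are weighted sums of the latent functions.
    F = G @ W.T  # shape: (n_sites, n_species)
    # Logistic link (an assumption) maps responses to presence probabilities.
    probs = 1.0 / (1.0 + np.exp(-F))
    return probs, W, G

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(50, 3))  # 50 hypothetical sites, 3 covariates
probs, W, G = sample_mogp_prior(X, n_species=20, n_latent=4, rng=rng)
```

Because all 20 species share the same 4 latent functions, the matrix of responses has rank at most 4. This sharing is one way to picture how a rare species can 'borrow strength' from better-observed species with similar weights.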