Integrating multiple data sources in species distribution modeling: a framework for data fusion

Cited by: 170
Authors
Pacifici, Krishna [1]
Reich, Brian J. [2]
Miller, David A. W. [3]
Gardner, Beth [4]
Stauffer, Glenn [3]
Singh, Susheela [2]
McKerrow, Alexa [5]
Collazo, Jaime A. [6]
Affiliations
[1] North Carolina State Univ, Dept Forestry & Environm Resources, Program Fisheries Wildlife & Conservat Biol, Raleigh, NC 27695 USA
[2] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[3] Penn State Univ, Dept Ecosyst Sci & Management, University Pk, PA 16802 USA
[4] Univ Washington, Sch Environm & Forest Sci, Seattle, WA 98195 USA
[5] North Carolina State Univ, US Geol Survey, Core Sci Syst Biodivers & Spatial Informat Ctr, Raleigh, NC 27695 USA
[6] North Carolina State Univ, US Geol Survey, Dept Appl Ecol, North Carolina Cooperat Fish & Wildlife Res Unit, Raleigh, NC 27695 USA
Keywords
Brown-headed nuthatch; data fusion; multivariate conditional autoregressive; species distribution modeling; PRESENCE-ONLY DATA; ESTIMATING SITE OCCUPANCY; LOGISTIC-REGRESSION; REPLICATED COUNTS; POPULATION-SIZE; CITIZEN SCIENCE; MIXTURE-MODELS; ABUNDANCE; MAXENT; ERRORS
DOI
10.1002/ecy.1710
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
The last decade has seen a dramatic increase in the use of species distribution models (SDMs) to characterize patterns of species' occurrence and abundance. Efforts to parameterize SDMs often create a tension between the quality and quantity of data available to fit models. Estimation methods that integrate both standardized and non-standardized data types offer a potential solution to the tradeoff between data quality and quantity. Recently, several authors have developed approaches for jointly modeling two sources of data (one of high quality and one of lesser quality). We extend their work by allowing for explicit spatial autocorrelation in occurrence and detection error using a Multivariate Conditional Autoregressive (MVCAR) model, and we develop three models that share information in a less direct manner, resulting in more robust performance when the auxiliary data are of lesser quality. We describe these three new approaches ("Shared," "Correlation," and "Covariates") for combining data sources and show their use in a case study of the Brown-headed Nuthatch in the Southeastern U.S. and through simulations. All three approaches that used the second data source improved out-of-sample predictions relative to a single data source ("Single"). When the information in the second data source is of high quality, the Shared model performs best, but the Correlation and Covariates models also perform well. When the information in the second data source is of lesser quality, the Correlation and Covariates models perform better, suggesting they are robust alternatives when little is known about auxiliary data collected opportunistically or through citizen scientists. Methods that allow both data types to be used will maximize the useful information available for estimating species distributions.
Pages: 840-850
Number of pages: 11