Integrating multiple data sources in species distribution modeling: a framework for data fusion

Cited by: 158
Authors
Pacifici, Krishna [1]
Reich, Brian J. [2]
Miller, David A. W. [3]
Gardner, Beth [4]
Stauffer, Glenn [3]
Singh, Susheela [2]
McKerrow, Alexa [5]
Collazo, Jaime A. [6]
Affiliations
[1] North Carolina State Univ, Dept Forestry & Environm Resources, Program Fisheries Wildlife & Conservat Biol, Raleigh, NC 27695 USA
[2] North Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[3] Penn State Univ, Dept Ecosyst Sci & Management, University Pk, PA 16802 USA
[4] Univ Washington, Sch Environm & Forest Sci, Seattle, WA 98195 USA
[5] North Carolina State Univ, US Geol Survey, Core Sci Syst Biodivers & Spatial Informat Ctr, Raleigh, NC 27695 USA
[6] North Carolina State Univ, US Geol Survey, Dept Appl Ecol, North Carolina Cooperat Fish & Wildlife Res Unit, Raleigh, NC 27695 USA
Keywords
Brown-headed nuthatch; data fusion; multivariate conditional autoregressive; species distribution modeling; PRESENCE-ONLY DATA; ESTIMATING SITE OCCUPANCY; LOGISTIC-REGRESSION; REPLICATED COUNTS; POPULATION-SIZE; CITIZEN SCIENCE; MIXTURE-MODELS; ABUNDANCE; MAXENT; ERRORS
DOI
10.1002/ecy.1710
Chinese Library Classification
Q14 [Ecology (Bioecology)]
Discipline classification codes
071012; 0713
Abstract
The last decade has seen a dramatic increase in the use of species distribution models (SDMs) to characterize patterns of species' occurrence and abundance. Efforts to parameterize SDMs often create a tension between the quality and quantity of data available to fit models. Estimation methods that integrate both standardized and non-standardized data types offer a potential solution to the tradeoff between data quality and quantity. Recently, several authors have developed approaches for jointly modeling two sources of data (one of high quality and one of lesser quality). We extend their work by allowing for explicit spatial autocorrelation in occurrence and detection error using a Multivariate Conditional Autoregressive (MVCAR) model, and we develop three models that share information in a less direct manner, resulting in more robust performance when the auxiliary data are of lesser quality. We describe these three new approaches ("Shared," "Correlation," "Covariates") for combining data sources and show their use in a case study of the Brown-headed Nuthatch in the Southeastern U.S. and through simulations. All three of the approaches that used the second data source improved out-of-sample predictions relative to a single data source ("Single"). When information in the second data source is of high quality, the Shared model performs best, but the Correlation and Covariates models also perform well. When the information in the second data source is of lesser quality, the Correlation and Covariates models performed better, suggesting they are robust alternatives when little is known about auxiliary data collected opportunistically or through citizen scientists. Methods that allow for both data types to be used will maximize the useful information available for estimating species distributions.
Pages: 840-850
Page count: 11
Related papers
50 in total
  • [1] A DATA FUSION FRAMEWORK FOR FRACTURE TOUGHNESS MODELING USING MULTIPLE SOURCES OF DATA
    Mou, Shancong
    Chen, Jialei
    Zhang, Chuck
    Wang, Ben
    PROCEEDINGS OF THE 2020 INTERNATIONAL SYMPOSIUM ON FLEXIBLE AUTOMATION (ISFA2020), 2020
  • [2] Unity is strength: species distribution models integrating different data sources
    Fernandez-Lopez, Javier
    Acevedo, Pelayo
    Gimenez, Olivier
    ECOSISTEMAS, 2023, 32 (01)
  • [3] Integrating multiple sources of ecological data to unveil macroscale species abundance
    Fukaya, Keiichi
    Kusumoto, Buntarou
    Shiono, Takayuki
    Fujinuma, Junichi
    Kubota, Yasuhiro
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [4] Integrating multiple sources of ecological data to unveil macroscale species abundance
    Keiichi Fukaya
    Buntarou Kusumoto
    Takayuki Shiono
    Junichi Fujinuma
    Yasuhiro Kubota
    Nature Communications, 11
  • [5] Integrating multiple data sources to fit matrix population models for interacting species
    Barraquand, Frederic
    Gimenez, Olivier
    ECOLOGICAL MODELLING, 2019, 411
  • [6] A Novel Framework for Integrating Heterogeneous Data Sources through Data Exchange
    Cheng, Yin-Ting
    Chen, Ming-Chih
    SENSORS AND MATERIALS, 2023, 35 (07) : 2603 - 2618
  • [7] INTEGRATING MULTIPLE BUILT ENVIRONMENT DATA SOURCES
    Won, Jung Yeon
    Elliott, Michael R.
    Sanchez-Vaznaugh, Emma V.
    Sanchez, Brisa N.
    ANNALS OF APPLIED STATISTICS, 2023, 17 (02) : 1722 - 1739
  • [8] Integrating Multiple Data Sources for Stock Prediction
    Wu, Di
    Fung, Gabriel Pui Cheong
    Yu, Jeffrey Xu
    Liu, Zheng
    WEB INFORMATION SYSTEMS ENGINEERING - WISE 2008, PROCEEDINGS, 2008, 5175 : 77 - +
  • [9] A general framework of multiple coordinative data fusion modules for real-time and heterogeneous data sources
    Kashinath, Shafiza Ariffin
    Mostafa, Salama A.
    Lim, David
    Mustapha, Aida
    Hafit, Hanayanti
    Darman, Rozanawati
    JOURNAL OF INTELLIGENT SYSTEMS, 2021, 30 (01) : 947 - 965
  • [10] 'RISDM': species distribution modelling from multiple data sources in R
    Foster, Scott D.
    Peel, David
    Hosack, Geoffrey R.
    Hoskins, Andrew
    Mitchell, David J.
    Proft, Kirstin
    Yang, Wen-Hsi
    Uribe-Rivera, David E.
    Froese, Jens G.
    ECOGRAPHY, 2024, 2024 (06)