Spatial cross-validation is not the right way to evaluate map accuracy

Cited by: 127
Authors
Wadoux, Alexandre M. J-C. [1, 2]
Heuvelink, Gerard B. M. [3 ]
de Bruin, Sytze [4 ]
Brus, Dick J. [5 ]
Affiliations
[1] Univ Sydney, Sydney Inst Agr, Sydney, NSW, Australia
[2] Univ Sydney, Sch Life & Environm Sci, Sydney, NSW, Australia
[3] Wageningen Univ & Res, Soil Geog & Landscape Grp, Wageningen, Netherlands
[4] Wageningen Univ & Res, Lab Geoinformat Sci & Remote Sensing, Wageningen, Netherlands
[5] Wageningen Univ & Res, Biometris, Wageningen, Netherlands
Keywords
Map quality; Model performance; Above-ground biomass; Sampling theory; Design-based; Model-based; Random forest; Design-unbiased; Sampling design; Data set
DOI
10.1016/j.ecolmodel.2021.109692
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
For decades scientists have produced maps of biological, ecological and environmental variables. These studies commonly evaluate the map accuracy through cross-validation with the data used for calibrating the underlying mapping model. Recent studies, however, have argued that cross-validation statistics of most mapping studies are optimistically biased. They attribute these overoptimistic results to a supposed serious methodological flaw in standard cross-validation methods, namely that these methods ignore spatial autocorrelation in the data. They argue that spatial cross-validation should be used instead, and contend that standard cross-validation methods are inherently invalid in a geospatial context because of the autocorrelation present in most spatial data. Here we argue that these studies propagate a widespread misconception of statistical validation of maps. We explain that unbiased estimates of map accuracy indices can be obtained by probability sampling and design-based inference and illustrate this with a numerical experiment on large-scale above-ground biomass mapping. In our experiment, standard cross-validation (i.e., ignoring autocorrelation) led to smaller bias than spatial cross-validation. Standard cross-validation was deficient in case of a strongly clustered dataset that had large differences in sampling density, but less so than spatial cross-validation. We conclude that spatial cross-validation methods have no theoretical underpinning and should not be used for assessing map accuracy, while standard cross-validation is deficient in case of clustered data. Model-free, design-unbiased and valid accuracy assessment is achieved with probability sampling and design-based inference. It is valid without the need to explicitly incorporate or adjust for spatial autocorrelation and is perfectly suited for the validation of large-scale biological, ecological and environmental maps.
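The idea of design-based accuracy assessment mentioned in the abstract can be illustrated with a minimal sketch. It assumes the simplest probability design, simple random sampling, and a made-up population of map errors (prediction minus observation); the function name `design_based_accuracy`, the population size, the error distribution and the sample size are all hypothetical and are not taken from the paper's experiment.

```python
import math
import random


def design_based_accuracy(errors_sample):
    """Estimate map accuracy indices from a simple random sample of map errors.

    Under simple random sampling, the sample means of the errors and of the
    squared errors are design-unbiased estimators of the population mean
    error (ME) and mean squared error (MSE). No model of spatial
    autocorrelation is involved.
    """
    n = len(errors_sample)
    squared = [e * e for e in errors_sample]
    me = sum(errors_sample) / n
    mse = sum(squared) / n
    # Standard error of the MSE estimate (finite population correction ignored).
    var_squared = sum((s - mse) ** 2 for s in squared) / (n - 1)
    se_mse = math.sqrt(var_squared / n)
    # RMSE is reported as sqrt(MSE); the square root is not itself unbiased.
    return {"ME": me, "MSE": mse, "RMSE": math.sqrt(mse), "SE_MSE": se_mse}


# Hypothetical population of map errors, one value per map cell (assumption).
rng = random.Random(42)
population_errors = [rng.gauss(0.0, 10.0) for _ in range(100_000)]
true_mse = sum(e * e for e in population_errors) / len(population_errors)

# Probability sample: every cell has the same known inclusion probability.
sample = rng.sample(population_errors, 500)
est = design_based_accuracy(sample)

print(f"true MSE      : {true_mse:.2f}")
print(f"estimated MSE : {est['MSE']:.2f} (SE {est['SE_MSE']:.2f})")
print(f"estimated RMSE: {est['RMSE']:.2f}")
```

The validity of the estimate rests only on the random selection of validation locations, which is why no adjustment for spatial autocorrelation appears anywhere in the sketch. Other probability designs (stratified, cluster) would need their own estimators with inclusion-probability weights.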
Pages: 5
Related papers
50 in total
  • [1] Dealing with clustered samples for assessing map accuracy by cross-validation
    de Bruin, Sytze
    Brus, Dick J.
    Heuvelink, Gerard B. M.
    Tengbergen, Tom van Ebbenhorst
    Wadoux, Alexandre M. J-C.
    ECOLOGICAL INFORMATICS, 2022, 69
  • [2] On the Accuracy of Cross-Validation in the Classification Problem
    Nedel'ko, V. M.
    BULLETIN OF IRKUTSK STATE UNIVERSITY-SERIES MATHEMATICS, 2021, 38 : 84 - 95
  • [3] The accuracy of cross-validation results in forecasting
    vonPuelz, A
    Sobol, MG
    DECISION SCIENCES, 1995, 26 (06) : 803 - 818
  • [4] Exploring the impact of spatial autocorrelation on optimistic bias in cross-validation and assessing the effectiveness of spatial cross-validation
    Yoo, Musang
    Koo, Hyeongmo
    CARTOGRAPHY AND GEOGRAPHIC INFORMATION SCIENCE, 2024,
  • [5] Spatial plus : A new cross-validation method to evaluate geospatial machine learning models
    Wang, Yanwen
    Khodadadzadeh, Mahdi
    Zurita-Milla, Raul
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2023, 121
  • [6] Multivariate Cross-Validation and Measures of Accuracy and Precision
    Ute Mueller
    Sangga Rima Roman Selia
    Raimon Tolosana-Delgado
    Mathematical Geosciences, 2023, 55 : 693 - 711
  • [7] Unsupervised stratification of cross-validation for accuracy estimation
    Diamantidis, NA
    Karlis, D
    Giakoumakis, EA
    ARTIFICIAL INTELLIGENCE, 2000, 116 (1-2) : 1 - 16
  • [8] Multivariate Cross-Validation and Measures of Accuracy and Precision
    Mueller, Ute
    Selia, Sangga Rima Roman
    Tolosana-Delgado, Raimon
    MATHEMATICAL GEOSCIENCES, 2023, 55 (05) : 693 - 711
  • [9] Spatial Cross-Validation for Globally Distributed Data
    Beigaite, Rita
    Mechenich, Michael
    Zliobaite, Indre
    DISCOVERY SCIENCE (DS 2022), 2022, 13601 : 127 - 140
  • [10] Kriging and cross-validation for massive spatial data
    Zhang, Hao
    Wang, Yong
    ENVIRONMETRICS, 2010, 21 (3-4) : 290 - 304