Spatial cross-validation is not the right way to evaluate map accuracy

Cited by: 127

Authors:

Wadoux, Alexandre M. J-C [1,2]

Heuvelink, Gerard B. M. [3]

de Bruin, Sytze [4]

Brus, Dick J. [5]

Affiliations:

[1] Univ Sydney, Sydney Inst Agr, Sydney, NSW, Australia

[2] Univ Sydney, Sch Life & Environm Sci, Sydney, NSW, Australia

[3] Wageningen Univ & Res, Soil Geog & Landscape Grp, Wageningen, Netherlands

[4] Wageningen Univ & Res, Lab Geoinformat Sci & Remote Sensing, Wageningen, Netherlands

[5] Wageningen Univ & Res, Biometris, Wageningen, Netherlands

Source:

ECOLOGICAL MODELLING | 2021 / Vol. 457

Keywords:

Map quality; Model performance; Above-ground biomass; Sampling theory; Design-based; Model-based; Random forest; Design-unbiased; SAMPLING DESIGN; DATA SET;

DOI:

10.1016/j.ecolmodel.2021.109692

Chinese Library Classification (CLC) number:

Q14 [Ecology (Bioecology)];

Subject classification codes:

071012 ; 0713 ;

Abstract:

For decades scientists have produced maps of biological, ecological and environmental variables. These studies commonly evaluate the map accuracy through cross-validation with the data used for calibrating the underlying mapping model. Recent studies, however, have argued that cross-validation statistics of most mapping studies are optimistically biased. They attribute these overoptimistic results to a supposed serious methodological flaw in standard cross-validation methods, namely that these methods ignore spatial autocorrelation in the data. They argue that spatial cross-validation should be used instead, and contend that standard cross-validation methods are inherently invalid in a geospatial context because of the autocorrelation present in most spatial data. Here we argue that these studies propagate a widespread misconception of statistical validation of maps. We explain that unbiased estimates of map accuracy indices can be obtained by probability sampling and design-based inference and illustrate this with a numerical experiment on large-scale above-ground biomass mapping. In our experiment, standard cross-validation (i.e., ignoring autocorrelation) led to smaller bias than spatial cross-validation. Standard cross-validation was deficient in case of a strongly clustered dataset that had large differences in sampling density, but less so than spatial cross-validation. We conclude that spatial cross-validation methods have no theoretical underpinning and should not be used for assessing map accuracy, while standard cross-validation is deficient in case of clustered data. Model-free, design-unbiased and valid accuracy assessment is achieved with probability sampling and design-based inference. It is valid without the need to explicitly incorporate or adjust for spatial autocorrelation and perfectly suited for the validation of large scale biological, ecological and environmental maps.
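The abstract's central claim, that probability sampling with design-based inference gives unbiased estimates of map accuracy even when errors are spatially autocorrelated, can be illustrated with a small simulation. The sketch below is not the authors' above-ground biomass experiment: the error surface is synthetic, and every name in it (`simulate_errors`, `srs_mse_estimate`, the grid size, the sample size) is a hypothetical choice made for illustration. It draws repeated simple random samples of validation locations and compares the average of the sample-based mean squared error (MSE) estimates with the population MSE.

```python
import math
import random


def simulate_errors(side, rng):
    """Synthetic map errors (prediction minus truth) on a side x side grid.

    A smooth trend makes neighbouring errors similar, i.e. spatially
    autocorrelated; Gaussian noise is added on top. Purely illustrative.
    """
    errors = []
    for row in range(side):
        for col in range(side):
            trend = math.sin(row / 8.0) + math.cos(col / 11.0)
            errors.append(trend + rng.gauss(0.0, 0.5))
    return errors


def srs_mse_estimate(errors, n, rng):
    """Estimate the population MSE from a simple random sample of n cells.

    Under simple random sampling without replacement, the sample mean of
    the squared errors is a design-unbiased estimator of the population MSE.
    """
    sample = rng.sample(errors, n)
    return sum(e * e for e in sample) / n


rng = random.Random(42)  # fixed seed so the run is reproducible
errors = simulate_errors(50, rng)  # population of 2500 grid cells
population_mse = sum(e * e for e in errors) / len(errors)

# Repeat the sampling many times; the average of the estimates should
# sit close to the population value despite the autocorrelated errors.
estimates = [srs_mse_estimate(errors, 50, rng) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)

print(f"population MSE:        {population_mse:.4f}")
print(f"mean of SRS estimates: {mean_estimate:.4f}")
```

The estimator never uses the spatial coordinates of the sampled cells. Unbiasedness here follows from the random selection of locations rather than from any model of the error process, which is why no adjustment for autocorrelation is needed.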


Pages: 5


[31] Cross-Validation Without Doing Cross-Validation in Genome-Enabled Prediction
Gianola, Daniel
Schoen, Chris-Carolin
G3-GENES GENOMES GENETICS, 2016, 6 (10) : 3107 - 3128
[32] Nearest neighbour distance matching Leave-One-Out Cross-Validation for map validation
Mila, Carles
Mateu, Jorge
Pebesma, Edzer
Meyer, Hanna
METHODS IN ECOLOGY AND EVOLUTION, 2022, 13 (06) : 1304 - 1316
[33] Cross-validation is dead. Long live cross-validation! Model validation based on resampling
Baumann, Knut
Journal of Cheminformatics, 2 (Suppl 1)
[34] Validation and Cross-Validation Methods for ASCAT
Anderson, Craig
Figa-Saldana, Julia
Wilson, John Julian William
Ticconi, Francesca
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2017, 10 (05) : 2232 - 2239
[35] Cross-validation of the hierarchical organization of MAP, a self-actualization measurement scale
Gana, IK
Trouillet, R
Martin, B
Toffart, L
CANADIAN PSYCHOLOGY-PSYCHOLOGIE CANADIENNE, 2002, 43 (02) : 106 - 111
[36] Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure
Roberts, David R.
Bahn, Volker
Ciuti, Simone
Boyce, Mark S.
Elith, Jane
Guillera-Arroita, Gurutzeta
Hauenstein, Severin
Lahoz-Monfort, Jose J.
Schroeder, Boris
Thuiller, Wilfried
Warton, David I.
Wintle, Brendan A.
Hartig, Florian
Dormann, Carsten F.
ECOGRAPHY, 2017, 40 (08) : 913 - 929
[37] A rapid cross-validation computing for three-way decisions in imbalanced data
Xu, Jianfeng
Liu, Xing
Gu, Zhenzhen
Xiao, Guohui
INFORMATION SCIENCES, 2025, 707
[38] SYMPOSIUM: THE NEED AND MEANS OF CROSS-VALIDATION III. CROSS-VALIDATION OF ITEM ANALYSES
Katzell, Raymond A.
EDUCATIONAL AND PSYCHOLOGICAL MEASUREMENT, 1951, 11 (01) : 16 - 22
[39] Three way k-fold cross-validation of resource selection functions
Wiens, Trevor S.
Dale, Brenda C.
Boyce, Mark S.
Kershaw, G. Peter
ECOLOGICAL MODELLING, 2008, 212 (3-4) : 244 - 255
[40] Cross-validation and permutations in MVPA: Validity of permutation strategies and power of cross-validation schemes
Valente, Giancarlo
Castellanos, Agustin Lage
Hausfeld, Lars
De Martino, Federico
Formisano, Elia
NEUROIMAGE, 2021, 238
