Dealing with spatial autocorrelation when learning predictive clustering trees

Cited by: 32
Authors
Stojanova, Daniela [1, 2]
Ceci, Michelangelo [3]
Appice, Annalisa [3]
Malerba, Donato [3]
Dzeroski, Saso [1, 2, 4]
Affiliations
[1] Jozef Stefan Inst, Dept Knowledge Technol, Ljubljana 1000, Slovenia
[2] Jozef Stefan Int Postgrad Sch, Ljubljana 1000, Slovenia
[3] Univ Bari Aldo Moro, Dipartimento Informat, I-70125 Bari, Italy
[4] Ctr Excellence Integrated Approaches Chem & Biol, Ljubljana 1000, Slovenia
Keywords
Spatial autocorrelation; Predictive clustering trees; Machine learning; Classification and regression; CLASSIFICATION; DEPENDENCE; PATTERNS
DOI
10.1016/j.ecoinf.2012.10.006
CLC number
Q14 [Ecology (bioecology)]
Subject classification codes
071012; 0713
Abstract
Spatial autocorrelation is the correlation among data values that is strictly due to the relative spatial proximity of the objects the data refer to. Treating spatially dependent data inappropriately, by ignoring spatial autocorrelation, can obscure important insights. In this paper, we propose a data mining method that explicitly considers spatial autocorrelation in the values of the response (target) variable when learning predictive clustering models. The method is based on the concept of predictive clustering trees (PCTs), according to which hierarchies of clusters of similar data are identified and a predictive model is associated with each cluster. In particular, our approach is able to learn predictive models for both a continuous response (regression task) and a discrete response (classification task). We evaluate our approach on several real-world problems of spatial regression and spatial classification. Taking autocorrelation into account in the models yields predictions that are consistently clustered in space and clusters that tend to preserve the spatial arrangement of the data, while also providing multi-level insight into the spatial autocorrelation phenomenon. The evaluation of the method, SCLUS, in several ecological domains (e.g. predicting outcrossing rates within a conventional field due to the surrounding genetically modified fields, as well as predicting pollen dispersal rates from two lines of plants) confirms its capability of building spatially aware models that capture the spatial distribution of the target variable. In general, the maps obtained with SCLUS do not require further post-smoothing of the results before they are used in practice. Crown Copyright (C) 2012 Published by Elsevier B.V. All rights reserved.
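To make the notion of spatial autocorrelation in a response variable concrete, the sketch below computes global Moran's I, a commonly used autocorrelation statistic. The abstract does not say which measure SCLUS uses, so the choice of Moran's I, the binary distance-band weights, and the names morans_i and bandwidth are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np


def morans_i(values, coords, bandwidth):
    """Global Moran's I for a response variable observed at point locations.

    Assumption (not from the paper): two points are neighbours, with
    weight 1, when their Euclidean distance is at most `bandwidth`.
    """
    y = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = y.size

    # Pairwise Euclidean distances between all locations.
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Binary distance-band weight matrix; a point is not its own neighbour.
    w = (dist <= bandwidth).astype(float)
    np.fill_diagonal(w, 0.0)

    z = y - y.mean()          # deviations from the mean response
    s0 = w.sum()              # total weight
    denom = (z ** 2).sum()    # total squared deviation
    if s0 == 0 or denom == 0:
        raise ValueError("need at least one neighbour pair and non-constant values")

    # I = (n / S0) * sum_ij w_ij z_i z_j / sum_i z_i^2
    return (n / s0) * (z @ w @ z) / denom


# Toy 4 x 4 grid: one spatially clustered and one alternating response.
grid = [(i, j) for i in range(4) for j in range(4)]
clustered = [0 if j < 2 else 1 for i, j in grid]
checkerboard = [(i + j) % 2 for i, j in grid]

print(morans_i(clustered, grid, bandwidth=1.0))
print(morans_i(checkerboard, grid, bandwidth=1.0))
```

Positive values indicate that nearby locations tend to have similar responses, and negative values indicate that neighbours tend to differ. A tree learner that accounts for autocorrelation could, under these assumptions, score candidate splits partly by such a statistic computed within each resulting cluster.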
Pages: 22-39
Page count: 18
Related papers
50 in total
  • [1] Global and Local Spatial Autocorrelation in Predictive Clustering Trees
    Stojanova, Daniela
    Ceci, Michelangelo
    Appice, Annalisa
    Malerba, Donato
    Dzeroski, Saso
    [J]. DISCOVERY SCIENCE, 2011, 6926 : 307 - +
  • [2] A spatial clustering perspective on autocorrelation and regionalization
    Ferenc Csillag
    Sándor Kabos
    Tarmo K. Remmel
    [J]. Environmental and Ecological Statistics, 2008, 15 : 385 - 401
  • [3] A spatial clustering perspective on autocorrelation and regionalization
    Csillag, Ferenc
    Kabos, Sandor
    Remmel, Tarmo K.
    [J]. ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2008, 15 (04) : 385 - 401
  • [4] Ranking with predictive clustering trees
    Todorovski, L
    Blockeel, H
    Dzeroski, S
    [J]. MACHINE LEARNING: ECML 2002, 2002, 2430 : 444 - 455
  • [5] Oblique predictive clustering trees
    Stepisnik, Tomaz
    Kocev, Dragi
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 227
  • [6] A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models
    Geniaux, Ghislain
    Martinetti, Davide
    [J]. REGIONAL SCIENCE AND URBAN ECONOMICS, 2018, 72 : 74 - 85
  • [7] Network regression with predictive clustering trees
    Daniela Stojanova
    Michelangelo Ceci
    Annalisa Appice
    Sašo Džeroski
    [J]. Data Mining and Knowledge Discovery, 2012, 25 : 378 - 413
  • [8] Multivariate Predictive Clustering Trees for Classification
    Stepisnik, Tomaz
    Kocev, Dragi
    [J]. FOUNDATIONS OF INTELLIGENT SYSTEMS (ISMIS 2020), 2020, 12117 : 331 - 341
  • [9] Network Regression with Predictive Clustering Trees
    Stojanova, Daniela
    Ceci, Michelangelo
    Appice, Annalisa
    Dzeroski, Saso
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT III, 2011, 6913 : 333 - 348
  • [10] Network regression with predictive clustering trees
    Stojanova, Daniela
    Ceci, Michelangelo
    Appice, Annalisa
    Dzeroski, Saso
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2012, 25 (02) : 378 - 413