Addressing the issue of digital mapping of soil classes with imbalanced class observations

Cited by: 37
Authors
Sharififar, Amin [1]
Sarmadian, Fereydoon [1]
Malone, Brendan P. [2]
Minasny, Budiman [3]
Affiliations
[1] Univ Tehran, Dept Soil Sci, Coll Agr & Nat Resources, Karaj, Iran
[2] CSIRO, Agr & Food, Canberra, ACT, Australia
[3] Univ Sydney, Sydney Inst Agr, Sch Life & Environm Sci, Sydney, NSW, Australia
Keywords
Imbalanced classification; Digital soil mapping; Uncertainty assessment; Data resampling; Categorical soil mapping; Machine learning; CLASSIFICATION; DATASETS
DOI
10.1016/j.geoderma.2019.05.016
Chinese Library Classification
S15 [Soil Science]
Discipline classification code
0903; 090301
Abstract
Given how soils are distributed in the landscape, imbalanced class observations are an important modeling issue in soil class mapping. When the number of observations differs widely among the soil classes of an area, predictive models can underestimate or lose the minority classes and overestimate the majority classes. As a result, land with comparatively few soil profile observations may go unmapped in the digital maps. To address this problem, this paper investigated the usefulness of two data pretreatment techniques, over- and under-sampling, applied to three predictive models: decision trees (DT), random forest (RF), and multinomial logistic regression (MNLR). The study area is in northwestern Iran, with 452 profile observations on a regular grid covering about 12,000 ha. The area has 8 USDA soil great groups with an imbalanced frequency distribution. Results showed that modeling with the imbalanced class observations produced uncertain maps with minority classes lost and relatively poor accuracies. After the data were treated with over- and under-sampling, all models showed significant improvement in retaining the minority classes in both calibration and validation evaluations. Balancing the classes led to a notable decrease in the uncertainty of all three models, lowering the confusion index and raising the probability of occurrence for the soil classes in the final maps. Of the three models, decision trees showed the largest calibration and validation accuracies, with and without data treatment. RF tended to overestimate some of the majority classes. Data resampling can be a useful solution for dealing with imbalanced class observations and producing more certain digital soil maps.
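The abstract does not say which resampling algorithm or target class size the authors used, so the following is only a minimal sketch of the general idea, assuming plain random over- and under-sampling to a common class size. The function names, the `target` parameter, and the toy data are hypothetical. The confusion index is computed here as one minus the gap between the two largest class probabilities, a definition often used in soil mapping; whether the paper uses exactly this form is an assumption.

```python
import random
from collections import Counter, defaultdict


def resample_to_balance(samples, labels, target, seed=0):
    """Randomly over- or under-sample every class to `target` observations.

    Hypothetical helper: the paper's actual resampling scheme is not given
    in the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_x, out_y = [], []
    for y, xs in by_class.items():
        if len(xs) >= target:
            # Majority class: under-sample without replacement.
            chosen = rng.sample(xs, target)
        else:
            # Minority class: keep every observation, then duplicate
            # randomly chosen ones until the class reaches `target`.
            chosen = xs + rng.choices(xs, k=target - len(xs))
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y


def confusion_index(probs):
    """1 - (largest probability - second largest probability).

    Values near 0 mean one class clearly dominates; values near 1 mean
    the two most likely classes are almost equally probable.
    """
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


# Toy, made-up data: three classes with 60, 30 and 10 observations.
labels = ["A"] * 60 + ["B"] * 30 + ["C"] * 10
samples = list(range(len(labels)))  # stand-ins for profile records

bal_x, bal_y = resample_to_balance(samples, labels, target=30)
counts = Counter(bal_y)
print(counts)
print(confusion_index([0.7, 0.2, 0.1]))
```

In this sketch the balanced set would be passed to the classifier in place of the original observations, while validation would still use untouched data so that duplicated minority samples do not inflate the accuracy estimate.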
Pages: 84-92
Number of pages: 9