Distribution-balanced stratified cross-validation for accuracy estimation

Cited by: 155

Authors:

Zeng, XC [1]

Martinez, TR [1]

Affiliations:

[1] Brigham Young Univ, Dept Comp Sci, Provo, UT 84602 USA

Source:

JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE | 2000 / Vol. 12 / No. 01

Keywords:

cross-validation; machine learning research; true accuracy; classifier

DOI:

10.1080/095281300146272

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Cross-validation has often been applied in machine learning research for estimating the accuracies of classifiers. In this work, we propose an extension to this method, called distribution-balanced stratified cross-validation (DBSCV), which improves the estimation quality by providing balanced intraclass distributions when partitioning a data set into multiple folds. We have tested DBSCV on nine real-world and three artificial domains using the C4.5 decision trees classifier. The results show that DBSCV performs better (has smaller biases) than the regular stratified cross-validation in most cases, especially when the number of folds is small. The analysis and experiments based on three artificial data sets also reveal that DBSCV is particularly effective when multiple intraclass clusters exist in a data set.
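The abstract does not spell out how DBSCV builds its folds, so the following is only a rough sketch of one way to obtain balanced intraclass distributions, not the authors' procedure. It assumes numeric feature vectors and Euclidean distance, and the function name `balanced_stratified_folds` is hypothetical. Within each class it walks from one instance to the nearest unassigned instance of the same class and deals the instances into folds in round-robin order, so neighbours from the same intraclass cluster tend to land in different folds.

```python
import math
from collections import defaultdict


def balanced_stratified_folds(points, labels, k):
    """Return k lists of instance indices (hypothetical illustrative helper)."""
    # Group instance indices by class label (the usual stratification step).
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    next_fold = 0  # keeps advancing across classes so fold sizes stay even
    for label in sorted(by_class):
        remaining = list(by_class[label])
        current = remaining.pop(0)  # assumption: deterministic starting instance
        while True:
            folds[next_fold].append(current)
            next_fold = (next_fold + 1) % k
            if not remaining:
                break
            # Move to the closest unassigned instance of the same class.
            nearest = min(
                remaining,
                key=lambda j: math.dist(points[current], points[j]),
            )
            remaining.remove(nearest)
            current = nearest
    return folds
```

On a toy class made of two well-separated clusters, this assignment places one instance from each cluster in every fold, which is the kind of intraclass balance the abstract refers to. A plain stratified split only guarantees the class proportions.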

Pages: 1 / 12

Page count: 12

50 items in total

[21] Cross-validation and the estimation of conditional probability densities
Hall, P
Racine, J
Li, Q
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2004, 99 (468) : 1015 - 1026
[22] Optimal signal estimation using cross-validation
Michigan State Univ, East Lansing, United States
IEEE Signal Process Lett, 1 (23-25):
[23] THE EFFECT OF CROSS-VALIDATION ON REGIONAL BOARD EXAMINER ACCURACY
EISNER, J
RAY, L
JOURNAL OF DENTAL RESEARCH, 1981, 60 : 511 - 511
[24] Cross-validation EM training for robust parameter estimation
Shinozaki, T.
Ostendorf, M.
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 437 - +
[25] A COMPARISON OF CROSS-VALIDATION TECHNIQUES IN DENSITY-ESTIMATION
MARRON, JS
ANNALS OF STATISTICS, 1987, 15 (01): : 152 - 162
[26] BIASED AND UNBIASED CROSS-VALIDATION IN DENSITY-ESTIMATION
SCOTT, DW
TERRELL, GR
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1987, 82 (400) : 1131 - 1146
[27] Cross-validation and the estimation of probability distributions with categorical data
Ouyang, D
Li, Q
Racine, J
JOURNAL OF NONPARAMETRIC STATISTICS, 2006, 18 (01) : 69 - 100
[28] A NOTE ON MODIFIED CROSS-VALIDATION IN DENSITY-ESTIMATION
FELUCH, W
KORONACKI, J
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 1992, 13 (02) : 143 - 151
[29] ON THE CONSISTENCY OF CROSS-VALIDATION IN NONLINEAR WAVELET REGRESSION ESTIMATION
张双林
郑忠国
Acta Mathematica Scientia, 2000, (01) : 1 - 11
[30] Cross-validation for comparing multiple density estimation procedures
Lian, Heng
STATISTICS & PROBABILITY LETTERS, 2009, 79 (01) : 112 - 115