Adversarial validation, a strategy widely used among Kaggle competitors, provides a principled framework for selecting reasonable training and validation sets. It depends heavily on accurately detecting the difference between the distributions of the training and test sets released in a competition. However, typical adversarial validation measures this difference with only a K-fold cross-validated point estimator and ignores the estimator's variance; as a result, it tends to produce false positive conclusions. In this study, we reconsider adversarial validation from the perspective of algorithm comparison. Specifically, we formulate adversarial validation as the task of comparing a well-trained classifier against a random-guessing classifier on an adversarial data set. We then investigate state-of-the-art algorithm comparison methods to improve adversarial validation and reduce its false positive conclusions. Extensive simulated and real-world experiments show that the recently proposed 5x2 BCV McNemar's test significantly improves the performance of adversarial validation.
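The basic setup the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's method: the synthetic `X_train`/`X_test` matrices, the logistic-regression classifier, and the 5-fold split are all assumptions chosen for brevity, and only the plain cross-validated point estimate (the step the paper criticizes) is shown.

```python
# Minimal sketch of adversarial validation with a K-fold point estimate.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-ins for a competition's released training and test sets;
# the test rows are deliberately shifted so the distributions differ.
X_train = rng.normal(loc=0.0, size=(500, 5))
X_test = rng.normal(loc=0.5, size=(500, 5))

# Adversarial data set: label each row by its origin (0 = train, 1 = test).
X = np.vstack([X_train, X_test])
y = np.array([0] * len(X_train) + [1] * len(X_test))

# Typical adversarial validation: a K-fold cross-validated point estimate
# of how well a classifier separates the two sets.
acc = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"cross-validated accuracy: {acc:.3f}")
```

Accuracy well above 0.5 (random guessing) is taken as evidence that the two distributions differ. The abstract's point is that this bare point estimate ignores the estimator's variance, so a proper statistical comparison against the random-guessing baseline (such as the 5x2 BCV McNemar's test) is needed to control false positives.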
Chetverikov, D., Liao, Z., & Chernozhukov, V. (2021). Annals of Statistics, 49(3), 1300-1317.