An Improved Cross-Validated Adversarial Validation Method

Cited by: 0
Authors
Zhang, Wen [1]
Liu, Zhengjiang [1]
Xue, Yan [2]
Wang, Ruibo [3]
Cao, Xuefei [1]
Li, Jihong [3]
Affiliations
[1] Shanxi Univ, Sch Automat & Software Engn, Taiyuan 030006, Peoples R China
[2] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Peoples R China
[3] Shanxi Univ, Sch Modern Educ Technol, Taiyuan 030006, Peoples R China
Keywords
Adversarial Validation; Cross Validation; Algorithm Comparison; Significance Testing; Distribution Shift; Dataset Shift; Tests
DOI
10.1007/978-3-031-40283-8_29
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Adversarial validation, a strategy widely used among Kaggle competitors, provides a framework for selecting reasonable training and validation sets. It depends heavily on accurately identifying the difference between the distributions of the training and test sets released in a Kaggle competition. However, typical adversarial validation measures this difference with only a K-fold cross-validated point estimator and ignores the variation of that estimator. As a result, it tends to produce false-positive conclusions. In this study, we reconsider adversarial validation from the perspective of algorithm comparison. Specifically, we formulate adversarial validation as the task of comparing a well-trained classifier with a random-guessing classifier on an adversarial data set. We then investigate state-of-the-art algorithm comparison methods as a way to improve adversarial validation and reduce false-positive conclusions. In extensive simulated and real-world experiments, we show that the recently proposed 5 × 2 BCV McNemar's test can significantly improve the performance of the adversarial validation method.
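The formulation in the abstract can be illustrated with a minimal sketch. It is not the paper's 5 × 2 BCV McNemar's test: it pools ordinary K-fold predictions and applies a plain continuity-corrected McNemar's test, only to show what "comparing a trained classifier with a random-guessing classifier on an adversarial data set" could look like. All function names, the one-dimensional feature, and the nearest-class-mean classifier are assumptions made for illustration.

```python
import math
import random


def make_adversarial_dataset(train_x, test_x):
    """Pool both sets, labelling training rows 0 and test rows 1."""
    return [(x, 0) for x in train_x] + [(x, 1) for x in test_x]


def fit_centroid_classifier(rows):
    """Hypothetical stand-in classifier: nearest class mean on a 1-D feature."""
    means = {}
    for label in (0, 1):
        values = [x for x, y in rows if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min((0, 1), key=lambda c: abs(x - means[c]))


def adversarial_validation(train_x, test_x, k=5, seed=0):
    """Return (accuracy, McNemar statistic, p-value) against random guessing."""
    rng = random.Random(seed)
    data = make_adversarial_dataset(train_x, test_x)
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]

    correct = 0
    b = 0  # classifier right, random guess wrong
    c = 0  # random guess right, classifier wrong
    for fold in folds:
        held_out = set(fold)
        fit_rows = [row for i, row in enumerate(data) if i not in held_out]
        predict = fit_centroid_classifier(fit_rows)
        for i in fold:
            x, y = data[i]
            clf_ok = predict(x) == y
            guess_ok = rng.randint(0, 1) == y
            correct += clf_ok
            b += clf_ok and not guess_ok
            c += guess_ok and not clf_ok

    accuracy = correct / len(data)
    # Continuity-corrected McNemar statistic on the discordant pairs.
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c) if (b + c) else 0.0
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return accuracy, stat, p_value


if __name__ == "__main__":
    rng = random.Random(1)
    train = [rng.gauss(0, 1) for _ in range(200)]
    test_shifted = [rng.gauss(3, 1) for _ in range(200)]
    print(adversarial_validation(train, test_shifted))
```

The point estimate alone (`accuracy`) corresponds to the typical adversarial validation criticized in the abstract; the test statistic and p-value add the significance-testing step. Pooling K-fold predictions into a single McNemar's test ignores the dependence between folds, which is the kind of issue that dedicated cross-validated tests are designed to address.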
Pages: 343-353
Page count: 11