Two-sample Testing Using Deep Learning

Cited by: 0
Authors
Kirchler, Matthias [1, 2]
Khorasani, Shahryar [1]
Kloft, Marius [2, 3]
Lippert, Christoph [1, 4]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst Digital Engn, Potsdam, Germany
[2] Tech Univ Kaiserslautern, Kaiserslautern, Germany
[3] Univ Southern Calif, Los Angeles, CA 90007 USA
[4] Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
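Funding
Canadian Institutes of Health Research; US National Institutes of Health
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract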
We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Their test statistics can be evaluated in linear time (in the sample size). Suitable data representations are obtained in a data-driven way, by solving a supervised or unsupervised transfer-learning task on an auxiliary (potentially distinct) data set. If no auxiliary data is available, we split the data into two chunks: one for learning representations and one for computing the test statistic. In experiments on audio samples, natural images, and three-dimensional neuroimaging data, our tests yield significant decreases in type-2 error rate (up to 35 percentage points) compared to state-of-the-art two-sample tests such as kernel methods and classifier two-sample tests.
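The idea of a location test on hidden-layer features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the two statistics, so the code below assumes a generic Hotelling-type statistic with a pooled covariance, which is one standard asymptotic location test. The feature map `make_feature_map` is a hypothetical stand-in (fixed random weights followed by `tanh`, two outputs) for a network trained on auxiliary or held-out data. With two features, the statistic is asymptotically chi-square with 2 degrees of freedom under the null, so the p-value is `exp(-stat / 2)`.

```python
import math
import random


def make_feature_map(dim_in, seed=0):
    """Hypothetical stand-in for a pretrained hidden layer (assumption):
    fixed random weights followed by tanh, producing 2 features."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(2)]

    def phi(x):
        return tuple(math.tanh(sum(w * v for w, v in zip(row, x)))
                     for row in weights)

    return phi


def mean_and_scatter(features):
    """Mean vector and 2x2 scatter (sum of centered outer products)."""
    n = len(features)
    m0 = sum(f[0] for f in features) / n
    m1 = sum(f[1] for f in features) / n
    s00 = sum((f[0] - m0) ** 2 for f in features)
    s01 = sum((f[0] - m0) * (f[1] - m1) for f in features)
    s11 = sum((f[1] - m1) ** 2 for f in features)
    return (m0, m1), (s00, s01, s11)


def location_test(xs, ys, phi):
    """Hotelling-type test on mean feature differences (illustrative choice).

    Runs in time linear in the sample sizes: each point is mapped once
    and contributes to running sums only.
    Returns (statistic, asymptotic p-value for 2 features).
    """
    fx = [phi(x) for x in xs]
    fy = [phi(y) for y in ys]
    n, m = len(fx), len(fy)
    (ax, bx), (x00, x01, x11) = mean_and_scatter(fx)
    (ay, by), (y00, y01, y11) = mean_and_scatter(fy)
    # Pooled covariance of the two feature samples.
    c00 = (x00 + y00) / (n + m - 2)
    c01 = (x01 + y01) / (n + m - 2)
    c11 = (x11 + y11) / (n + m - 2)
    det = c00 * c11 - c01 * c01
    d0, d1 = ax - ay, bx - by
    # d^T C^{-1} d, using the closed-form inverse of a 2x2 matrix.
    quad = (c11 * d0 * d0 - 2.0 * c01 * d0 * d1 + c00 * d1 * d1) / det
    stat = n * m / (n + m) * quad
    # Chi-square survival function with 2 degrees of freedom.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


if __name__ == "__main__":
    rng = random.Random(1)
    dim = 5
    phi = make_feature_map(dim)
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(300)]
    ys = [[rng.gauss(0.5, 1.0) for _ in range(dim)] for _ in range(300)]
    stat, p = location_test(xs, ys, phi)
    print(f"statistic = {stat:.3f}, p-value = {p:.4g}")
```

In a data-splitting setup as described in the abstract, the feature map would be learned on one chunk and `location_test` would be called only on the other chunk, so that the representation is fixed with respect to the data being tested.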
Pages: 1387-1397
Number of pages: 11