Two-sample Testing Using Deep Learning

Cited by: 0
Authors
Kirchler, Matthias [1, 2]
Khorasani, Shahryar [1]
Kloft, Marius [2, 3]
Lippert, Christoph [1, 4]
Affiliations
[1] Univ Potsdam, Hasso Plattner Inst Digital Engn, Potsdam, Germany
[2] Tech Univ Kaiserslautern, Kaiserslautern, Germany
[3] Univ Southern Calif, Los Angeles, CA 90007 USA
[4] Hasso Plattner Inst Digital Hlth Mt Sinai, New York, NY USA
Funding
Canadian Institutes of Health Research; US National Institutes of Health
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a two-sample testing procedure based on learned deep neural network representations. To this end, we define two test statistics that perform an asymptotic location test on data samples mapped onto a hidden layer. The tests are consistent and asymptotically control the type-1 error rate. Their test statistics can be evaluated in linear time (in the sample size). Suitable data representations are obtained in a data-driven way, by solving a supervised or unsupervised transfer-learning task on an auxiliary (potentially distinct) data set. If no auxiliary data is available, we split the data into two chunks: one for learning representations and one for computing the test statistic. In experiments on audio samples, natural images, and three-dimensional neuroimaging data, our tests yield significant decreases in type-2 error rate (up to 35 percentage points) compared to state-of-the-art two-sample tests such as kernel methods and classifier two-sample tests.
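The idea of a location test on hidden-layer features can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the statistics, so a Hotelling-type statistic with a chi-square limit is assumed here, and a fixed random ReLU layer (the hypothetical `feature_map`) stands in for a representation learned on auxiliary or held-out data. The names `location_test` and `chi2_sf_even` are also illustrative, and the closed-form chi-square tail is only valid for an even number of features.

```python
import math
import numpy as np


def feature_map(x, weights):
    # Hypothetical stand-in for a learned hidden layer: one ReLU layer
    # with fixed weights. In practice the representation would be learned
    # on auxiliary data or on a held-out chunk of the sample.
    return np.maximum(x @ weights, 0.0)


def chi2_sf_even(x, df):
    # Upper tail of a chi-square distribution with an even number of
    # degrees of freedom df = 2k: exp(-x/2) * sum_{i<k} (x/2)^i / i!
    if df % 2 != 0:
        raise ValueError("closed form used here needs an even df")
    half = x / 2.0
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return min(1.0, math.exp(-half) * terms)


def location_test(feat_x, feat_y, ridge=1e-8):
    # Assumed Hotelling-type statistic on the mapped samples. Means and
    # covariances are computed in one pass, so the cost grows linearly
    # with the sample size for a fixed feature dimension.
    n, d = feat_x.shape
    m = feat_y.shape[0]
    diff = feat_x.mean(axis=0) - feat_y.mean(axis=0)
    pooled = ((n - 1) * np.cov(feat_x, rowvar=False)
              + (m - 1) * np.cov(feat_y, rowvar=False)) / (n + m - 2)
    pooled = np.atleast_2d(pooled) + ridge * np.eye(d)  # numerical safety
    stat = float(n * m / (n + m) * diff @ np.linalg.solve(pooled, diff))
    # Asymptotic p-value from the chi-square limit with d degrees of freedom.
    return stat, chi2_sf_even(stat, d)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.normal(size=(5, 4))        # 5 inputs, 4 hidden features
    x = rng.normal(size=(500, 5))            # sample from P
    y = rng.normal(size=(500, 5)) + 1.0      # sample from Q (mean-shifted)
    stat, p_value = location_test(feature_map(x, weights),
                                  feature_map(y, weights))
    print(f"statistic = {stat:.2f}, p-value = {p_value:.3g}")
```

When no auxiliary data is available, the same functions would be applied to the second chunk only, after fitting the representation on the first chunk, so that the test statistic is computed on data the representation has not seen.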
Pages: 1387-1397
Number of pages: 11
Related papers
50 in total
  • [31] Two-Sample Testing for Tail Copulas with an Application to Equity Indices
    Can, Sami Umut
    Einmahl, John H. J.
    Laeven, Roger J. A.
    JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2024, 42 (01) : 147 - 159
  • [32] A permutation approach for testing heterogeneity in two-sample categorical variables
    Giancristofaro, Rosa Arboretti
    Bonnini, Stefano
    Pesarin, Fortunato
    STATISTICS AND COMPUTING, 2009, 19 (02) : 209 - 216
  • [33] A Semiparametric Two-Sample Hypothesis Testing Problem for Random Graphs
    Tang, Minh
    Athreya, Avanti
    Sussman, Daniel L.
    Lyzinski, Vince
    Park, Youngser
    Priebe, Carey E.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2017, 26 (02) : 344 - 354
  • [34] A TWO-SAMPLE TEST
    Moses, Lincoln E.
    PSYCHOMETRIKA, 1952, 17 (03) : 239 - 247
  • [35] Union-intersection permutation solution for two-sample equivalence testing
    Pesarin, Fortunato
    Salmaso, Luigi
    Carrozzo, Eleonora
    Arboretti, Rosa
    STATISTICS AND COMPUTING, 2016, 26 (03) : 693 - 701
  • [36] Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions
    Rastogi, Charvi
    Balakrishnan, Sivaraman
    Shah, Nihar B.
    Singh, Aarti
    JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [38] Two-sample Testing for Mean Functions with Incompletely Observed Functional Data
    Yan-qiu Zhou
    Yan-ling Wan
    Tao Zhang
    Acta Mathematicae Applicatae Sinica, English Series, 2020, 36 : 374 - 389
  • [39] Two-sample Testing for Mean Functions with Incompletely Observed Functional Data
    Zhou, Yan-qiu
    Wan, Yan-ling
    Zhang, Tao
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2020, 36 (02) : 374 - 389
  • [40] Two-Sample Testing on Pairwise Comparison Data and the Role of Modeling Assumptions
    Rastogi, Charvi
    Balakrishnan, Sivaraman
    Shah, Nihar
    Singh, Aarti
    2020 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2020 : 1271 - 1276