A permutation-free kernel two-sample test

Authors
Shekhar, Shubhanshu [1]
Kim, Ilmun [2]
Ramdas, Aaditya [3]
Affiliations
[1] Carnegie Mellon Univ, Dept Stat & Data Sci, Pittsburgh, PA 15213 USA
[2] Yonsei Univ, Dept Appl Stat, Dept Stat & Data Sci, Seoul, South Korea
[3] Carnegie Mellon Univ, Machine Learning Dept, Dept Stat & Data Sci, Pittsburgh, PA USA
Keywords
STATISTICS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The kernel Maximum Mean Discrepancy (MMD) is a popular multivariate distance metric between distributions that has found utility in two-sample testing. The usual kernel-MMD test statistic is a degenerate U-statistic under the null, and thus it has an intractable limiting distribution. Hence, to design a level-alpha test, one usually selects the rejection threshold as the (1-alpha)-quantile of the permutation distribution. The resulting nonparametric test has finite-sample validity but suffers from large computational cost, since every permutation takes quadratic time. We propose the cross-MMD, a new quadratic-time MMD test statistic based on sample-splitting and studentization. We prove that under mild assumptions, the cross-MMD has a limiting standard Gaussian distribution under the null. Importantly, we also show that the resulting test is consistent against any fixed alternative, and when using the Gaussian kernel, it has minimax rate-optimal power against local alternatives. For large sample sizes, our new cross-MMD provides a significant speedup over the MMD, for only a slight loss in power.
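The sample-splitting and studentization idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' reference implementation: the function names (`gaussian_kernel`, `cross_mmd_test`), the even split of each sample, the fixed `bandwidth`, and the Welch-style variance estimate are all assumptions made for illustration. The second halves estimate a witness function, the first halves evaluate it, and the studentized difference in means is compared with a standard Gaussian quantile instead of a permutation threshold.

```python
import numpy as np
from statistics import NormalDist


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


def cross_mmd_test(x, y, alpha=0.05, bandwidth=1.0):
    """Illustrative cross-MMD-style test (assumed form, see lead-in).

    Returns the studentized statistic and whether the null is rejected.
    """
    n1, m1 = len(x) // 2, len(y) // 2
    x1, x2 = x[:n1], x[n1:]
    y1, y2 = y[:m1], y[m1:]

    # Witness function estimated from the second halves,
    # evaluated at every point of the first halves.
    fx = (gaussian_kernel(x1, x2, bandwidth).mean(axis=1)
          - gaussian_kernel(x1, y2, bandwidth).mean(axis=1))
    fy = (gaussian_kernel(y1, x2, bandwidth).mean(axis=1)
          - gaussian_kernel(y1, y2, bandwidth).mean(axis=1))

    # Cross statistic: inner product of the two half-sample embeddings.
    cross = fx.mean() - fy.mean()

    # Studentization with an assumed two-sample (Welch-style) standard error.
    se = np.sqrt(fx.var(ddof=1) / n1 + fy.var(ddof=1) / m1)
    stat = float(cross / se)

    # One-sided rejection threshold from the standard Gaussian.
    threshold = NormalDist().inv_cdf(1.0 - alpha)
    return stat, bool(stat > threshold)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 2))
    y = rng.normal(loc=2.0, size=(200, 2))
    print(cross_mmd_test(x, y))
```

Each kernel matrix is computed once, so the cost stays quadratic in the sample size, whereas a permutation test would repeat a quadratic-time computation for every permutation.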
Pages: 13