Generalized kernel two-sample tests

Authors
Song, Hoseung [1]
Chen, Hao [1]
Affiliations
[1] Univ Calif Davis, Dept Stat, One Shields Ave, Davis, CA 95616 USA
Funding
National Science Foundation (USA)
Keywords
General alternative; High-dimensional data; Nonparametric test; Permutation null distribution; MULTIVARIATE; METRICS;
DOI
10.1093/biomet/asad068
Chinese Library Classification
Q [Biological sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do not work well for some scenarios when the dimension of the data is moderate to high due to the curse of dimensionality. We propose a new test statistic that makes use of a common pattern under moderate and high dimensions and achieves substantial power improvements over existing kernel two-sample tests for a wide range of alternatives. We also propose alternative testing procedures that maintain high power with low computational cost, offering easy off-the-shelf tools for large datasets. The new approaches are compared to other state-of-the-art tests under various settings and show good performance. We showcase the new approaches through two applications: the comparison of musks and nonmusks using the shape of molecules, and the comparison of taxi trips starting from John F. Kennedy airport in consecutive months. All proposed methods are implemented in an R package kerTests.
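For readers unfamiliar with the setting, the following is a minimal sketch of a baseline kernel two-sample test with a permutation null distribution. It uses the classic unbiased maximum mean discrepancy (MMD) statistic with a Gaussian kernel, not the new statistic proposed in the paper, and it does not use the kerTests package. The function names, the median-heuristic bandwidth, and the default number of permutations are all assumptions made for illustration.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian kernel Gram matrix for the rows of z."""
    sq_norms = np.sum(z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off negatives
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def median_bandwidth(z):
    """Median heuristic (an assumed choice): median pairwise distance."""
    sq_norms = np.sum(z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    upper = np.maximum(sq_dists[np.triu_indices(len(z), k=1)], 0.0)
    return np.sqrt(np.median(upper))


def mmd2_unbiased(gram, idx_x, idx_y):
    """Unbiased squared-MMD estimate from a pooled Gram matrix."""
    m, n = len(idx_x), len(idx_y)
    kxx = gram[np.ix_(idx_x, idx_x)]
    kyy = gram[np.ix_(idx_y, idx_y)]
    kxy = gram[np.ix_(idx_x, idx_y)]
    # Within-sample terms exclude the diagonal (self-similarity).
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * kxy.mean()


def mmd_permutation_test(x, y, n_perm=500, seed=0):
    """Return (observed statistic, permutation p-value)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    m, total = len(x), len(x) + len(y)
    gram = gaussian_gram(pooled, median_bandwidth(pooled))
    observed = mmd2_unbiased(gram, np.arange(m), np.arange(m, total))
    exceed = 0
    for _ in range(n_perm):
        # Relabel the pooled observations; the Gram matrix is reused.
        perm = rng.permutation(total)
        if mmd2_unbiased(gram, perm[:m], perm[m:]) >= observed:
            exceed += 1
    # The +1 correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Computing the Gram matrix once on the pooled sample and permuting only the index sets keeps each permutation cheap, although the overall cost is still quadratic in the pooled sample size.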
Pages: 755-770
Page count: 16
Related papers
50 in total
  • [1] SPECTRAL REGULARIZED KERNEL TWO-SAMPLE TESTS
    Hagrass, Omar
    Sriperumbudur, Bharath K.
    Li, Bing
    [J]. ANNALS OF STATISTICS, 2024, 52 (03) : 1076 - 1101
  • [2] Kernel two-sample tests for manifold data
    Cheng, Xiuyuan
    Xie, Yao
    [J]. BERNOULLI, 2024, 30 (04) : 2572 - 2597
  • [3] A Kernel Two-Sample Test
    Gretton, Arthur
    Borgwardt, Karsten M.
    Rasch, Malte J.
    Schoelkopf, Bernhard
    Smola, Alexander
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 723 - 773
  • [4] Scalable kernel two-sample tests via empirical likelihood and jackknife
    Wen, Qian
    Yuan, Mingao
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023, 52 (12) : 5975 - 5990
  • [5] Bayesian Kernel Two-Sample Testing
    Zhang, Qinyi
    Wild, Veit
    Filippi, Sarah
    Flaxman, Seth
    Sejdinovic, Dino
    [J]. JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2022, 31 (04) : 1164 - 1176
  • [6] A Kernel Two-Sample Test for Functional Data
    Wynne, George
    Duncan, Andrew B.
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2022, 23
  • [7] Characterising transitive two-sample tests
    Lumley, Thomas
    Gillen, Daniel L.
    [J]. STATISTICS & PROBABILITY LETTERS, 2016, 109 : 118 - 123
  • [8] A Differentially Private Kernel Two-Sample Test
    Raj, Anant
    Law, Ho Chung Leon
    Sejdinovic, Dino
    Park, Mijung
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT I, 2020, 11906 : 697 - 724
  • [9] Kernel two-sample tests in high dimensions: interplay between moment discrepancy and dimension-and-sample orders
    Yan, Jian
    Zhang, Xianyang
    [J]. BIOMETRIKA, 2023, 110 (02) : 411 - 430
  • [10] Modeling and Analysis of Students' Performance Trajectories using Diffusion Maps and Kernel Two-Sample Tests
    Rabin, N.
    Golan, M.
    Singer, G.
    Kleper, D.
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2019, 85 : 492 - 503