Generalized kernel two-sample tests

Cited by: 2
Authors
Song, Hoseung [1]
Chen, Hao [1]
Affiliations
[1] Univ Calif Davis, Dept Stat, One Shields Ave, Davis, CA 95616 USA
Funding
US National Science Foundation
Keywords
General alternative; High-dimensional data; Nonparametric test; Permutation null distribution; MULTIVARIATE; METRICS;
DOI
10.1093/biomet/asad068
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do not work well for some scenarios when the dimension of the data is moderate to high due to the curse of dimensionality. We propose a new test statistic that makes use of a common pattern under moderate and high dimensions and achieves substantial power improvements over existing kernel two-sample tests for a wide range of alternatives. We also propose alternative testing procedures that maintain high power with low computational cost, offering easy off-the-shelf tools for large datasets. The new approaches are compared to other state-of-the-art tests under various settings and show good performance. We showcase the new approaches through two applications: the comparison of musks and nonmusks using the shape of molecules, and the comparison of taxi trips starting from John F. Kennedy airport in consecutive months. All proposed methods are implemented in an R package kerTests.
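For orientation, the following is a minimal sketch of the classical kind of kernel two-sample test the abstract refers to as existing work: a maximum mean discrepancy (MMD) statistic with a Gaussian kernel, calibrated by a permutation null distribution. It is not the statistic proposed in the paper and not the interface of the kerTests package. The function names, the median-heuristic bandwidth, and the default of 500 permutations are all assumptions made for illustration.

```python
import numpy as np


def squared_distances(Z):
    """Pairwise squared Euclidean distances between the rows of Z."""
    sq = np.sum(Z ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (Z @ Z.T)
    return np.maximum(d2, 0.0)  # clip tiny negatives from rounding


def gaussian_gram(Z):
    """Gaussian kernel matrix; bandwidth from the median heuristic (an assumption)."""
    d2 = squared_distances(Z)
    upper = np.triu_indices(len(Z), k=1)
    sigma = np.sqrt(np.median(d2[upper]))
    return np.exp(-d2 / (2.0 * sigma ** 2))


def mmd2_unbiased(K, idx_x, idx_y):
    """Unbiased estimate of squared MMD from a precomputed kernel matrix."""
    m, n = len(idx_x), len(idx_y)
    Kxx = K[np.ix_(idx_x, idx_x)]
    Kyy = K[np.ix_(idx_y, idx_y)]
    Kxy = K[np.ix_(idx_x, idx_y)]
    term_x = (Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
    term_y = (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
    return term_x + term_y - 2.0 * Kxy.mean()


def mmd_permutation_test(X, Y, n_perm=500, seed=0):
    """Return (statistic, p-value) for H0: X and Y share one distribution."""
    rng = np.random.default_rng(seed)
    Z = np.vstack([X, Y])
    m, total = len(X), len(X) + len(Y)
    K = gaussian_gram(Z)  # computed once, reused for every permutation
    observed = mmd2_unbiased(K, np.arange(m), np.arange(m, total))
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)  # relabel the pooled observations
        if mmd2_unbiased(K, perm[:m], perm[m:]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `mmd_permutation_test(X, Y)` on two arrays of shape (samples, dimensions) returns the statistic and a permutation p-value. Because the kernel matrix is computed once on the pooled sample, each permutation only re-indexes it.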
Pages: 755-770
Number of pages: 16