Generalized kernel two-sample tests

Cited by: 2
Authors
Song, Hoseung [1]
Chen, Hao [1]
Affiliations
[1] Univ Calif Davis, Dept Stat, One Shields Ave, Davis, CA 95616 USA
Funding
National Science Foundation (USA)
Keywords
General alternative; High-dimensional data; Nonparametric test; Permutation null distribution; Multivariate; Metrics
DOI
10.1093/biomet/asad068
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Kernel two-sample tests have been widely used for multivariate data to test equality of distributions. However, existing tests based on mapping distributions into a reproducing kernel Hilbert space mainly target specific alternatives and do not work well for some scenarios when the dimension of the data is moderate to high due to the curse of dimensionality. We propose a new test statistic that makes use of a common pattern under moderate and high dimensions and achieves substantial power improvements over existing kernel two-sample tests for a wide range of alternatives. We also propose alternative testing procedures that maintain high power with low computational cost, offering easy off-the-shelf tools for large datasets. The new approaches are compared to other state-of-the-art tests under various settings and show good performance. We showcase the new approaches through two applications: the comparison of musks and nonmusks using the shape of molecules, and the comparison of taxi trips starting from John F. Kennedy airport in consecutive months. All proposed methods are implemented in an R package kerTests.
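For orientation, the kind of baseline the abstract refers to (a kernel two-sample test calibrated by a permutation null distribution) can be sketched as follows. This is a minimal illustration of the classical maximum mean discrepancy (MMD) permutation test, not the new statistic proposed in the paper and not the kerTests API. The Gaussian kernel, the median-distance bandwidth, and all function names here are assumptions chosen for the example.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def median_bandwidth(pooled):
    """Median pairwise Euclidean distance (a common heuristic, assumed here)."""
    dists = sorted(
        math.dist(pooled[i], pooled[j])
        for i in range(len(pooled))
        for j in range(i + 1, len(pooled))
    )
    return dists[len(dists) // 2] or 1.0  # fall back to 1.0 if the median is 0


def mmd2_biased(K, idx_x, idx_y):
    """Biased (V-statistic) estimate of squared MMD from a pooled kernel matrix."""
    m, n = len(idx_x), len(idx_y)
    kxx = sum(K[i][j] for i in idx_x for j in idx_x) / (m * m)
    kyy = sum(K[i][j] for i in idx_y for j in idx_y) / (n * n)
    kxy = sum(K[i][j] for i in idx_x for j in idx_y) / (m * n)
    return kxx + kyy - 2.0 * kxy


def mmd_permutation_test(xs, ys, n_perm=200, seed=0):
    """Return (observed squared MMD, permutation p-value) for samples xs and ys."""
    rng = random.Random(seed)
    pooled = list(xs) + list(ys)
    m, total = len(xs), len(pooled)
    bw = median_bandwidth(pooled)
    # The kernel matrix is computed once; permutations only relabel indices.
    K = [[gaussian_kernel(pooled[i], pooled[j], bw) for j in range(total)]
         for i in range(total)]
    idx = list(range(total))
    observed = mmd2_biased(K, idx[:m], idx[m:])
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(idx)
        if mmd2_biased(K, idx[:m], idx[m:]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    data_rng = random.Random(1)
    sample_x = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
    sample_y = [[data_rng.gauss(3.0, 1.0) for _ in range(5)] for _ in range(20)]
    stat, p = mmd_permutation_test(sample_x, sample_y)
    print(f"MMD^2 = {stat:.4f}, permutation p-value = {p:.4f}")
```

Computing the kernel matrix once and permuting only the group labels keeps each permutation cheap, but the cost still grows quadratically with the pooled sample size, which is one reason lower-cost procedures of the kind the abstract mentions are attractive for large datasets.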
Pages: 755-770
Page count: 16