Kernel two-sample tests for manifold data

Authors
Cheng, Xiuyuan [1]
Xie, Yao [2]
Affiliations
[1] Duke Univ, Dept Math, Durham, NC USA
[2] Georgia Inst Technol, Milton Stewart Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Keywords
Kernel methods; manifold data; Maximum Mean Discrepancy; two-sample test; goodness-of-fit; spectral convergence; graph Laplacian; probability; statistics
DOI
10.3150/23-BEJ1685
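Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract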
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, when data densities $p$ and $q$ are supported on a $d$-dimensional sub-manifold $M$ embedded in an $m$-dimensional space and are Hölder with order $\beta$ (up to 2) on $M$, we prove a guarantee of the test power for finite sample size $n$ that exceeds a threshold depending on $d$, $\beta$, and $\Delta_2$, the squared $L^2$-divergence between $p$ and $q$ on the manifold, and with a properly chosen kernel bandwidth $\gamma$. For small density departures, we show that with large $n$ they can be detected by the kernel test when $\Delta_2$ is greater than $n^{-2\beta/(d+4\beta)}$ up to a certain constant and $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test has no curse-of-dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
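As a rough illustration of the kind of test the abstract describes, the following Python sketch runs a Gaussian-kernel MMD two-sample test with a permutation threshold on points sampled from a circle ($d = 1$) embedded in $\mathbb{R}^3$. It is a minimal sketch under stated assumptions, not the authors' implementation: the biased (V-statistic) MMD estimate, the permutation calibration, the constant `c = 1` in the bandwidth rule, and the choice of a uniform versus von Mises density are all illustrative choices, and the function names are hypothetical. Only the bandwidth scaling $\gamma \sim n^{-1/(d+4\beta)}$ is taken from the abstract.

```python
import numpy as np

def gaussian_kernel(a, b, gamma):
    # Pairwise squared Euclidean distances, then exp(-||a - b||^2 / (2 gamma^2)).
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * gamma**2))

def mmd2(x, y, gamma):
    # Biased (V-statistic) estimate of the squared MMD (an assumed variant).
    kxx = gaussian_kernel(x, x, gamma)
    kyy = gaussian_kernel(y, y, gamma)
    kxy = gaussian_kernel(x, y, gamma)
    return kxx.mean() + kyy.mean() - 2.0 * kxy.mean()

def bandwidth(n, d, beta=2.0, c=1.0):
    # Scaling from the abstract: gamma ~ n^{-1/(d + 4 beta)}.
    # The constant c is an assumption; the abstract does not specify it.
    return c * n ** (-1.0 / (d + 4.0 * beta))

def permutation_test(x, y, gamma, n_perm=200, seed=0):
    # Calibrate the statistic by reshuffling the pooled sample.
    rng = np.random.default_rng(seed)
    observed = mmd2(x, y, gamma)
    pooled = np.vstack([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        if mmd2(pooled[idx[:n]], pooled[idx[n:]], gamma) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)

def embed_circle(theta):
    # Unit circle (intrinsic dimension d = 1) embedded in R^3.
    return np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

rng = np.random.default_rng(42)
n = 200
x = embed_circle(rng.uniform(0.0, 2.0 * np.pi, size=n))  # p: uniform on the circle
y = embed_circle(rng.vonmises(0.0, 1.0, size=n))         # q: von Mises, kappa = 1
gamma = bandwidth(n, d=1)   # uses the intrinsic dimension, not the ambient one
stat, p_value = permutation_test(x, y, gamma)
print(f"gamma={gamma:.3f}  MMD^2={stat:.4f}  p-value={p_value:.4f}")
```

Note that the bandwidth is chosen from the intrinsic dimension $d$ rather than the ambient dimension $m$, which is the sense in which the abstract says the test avoids the curse of dimensionality.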
Pages: 2572-2597
Number of pages: 26