Kernel two-sample tests for manifold data

Authors
Cheng, Xiuyuan [1]
Xie, Yao [2]
Affiliations
[1] Duke Univ, Dept Math, Durham, NC USA
[2] Georgia Inst Technol, Milton Stewart Sch Ind & Syst Engn, Atlanta, GA 30332 USA
Keywords
Kernel methods; manifold data; Maximum Mean Discrepancy; two-sample test; goodness-of-fit; spectral convergence; graph Laplacian; probability; statistics
DOI
10.3150/23-BEJ1685
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, when data densities $p$ and $q$ are supported on a $d$-dimensional sub-manifold $M$ embedded in an $m$-dimensional space and are Hölder with order $\beta$ (up to 2) on $M$, we prove a guarantee of the test power for finite sample size $n$ that exceeds a threshold depending on $d$, $\beta$, and $\Delta_2$, the squared $L^2$-divergence between $p$ and $q$ on the manifold, and with a properly chosen kernel bandwidth $\gamma$. For small density departures, we show that with large $n$ they can be detected by the kernel test when $\Delta_2$ is greater than $n^{-2\beta/(d+4\beta)}$ up to a certain constant and $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test has no curse-of-dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
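To make the setting in the abstract concrete, the following is a minimal sketch of a generic Gaussian-kernel MMD permutation test applied to data on a circle (intrinsic dimension d = 1) embedded in a higher-dimensional space. It is not the authors' exact statistic or threshold rule: the biased V-statistic, the permutation calibration, the von Mises alternative, the function names, and the bandwidth constant of 1 in gamma = n^(-1/(d+4*beta)) are all assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, gamma):
    # k(x, y) = exp(-|x - y|^2 / (2 * gamma^2)), computed for all pairs.
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * gamma**2))

def mmd2_from_gram(K, n_x):
    # Biased (V-statistic) estimate of squared MMD from the pooled Gram
    # matrix, whose first n_x rows/columns belong to the first sample.
    kxx = K[:n_x, :n_x].mean()
    kyy = K[n_x:, n_x:].mean()
    kxy = K[:n_x, n_x:].mean()
    return kxx + kyy - 2.0 * kxy

def mmd_permutation_test(x, y, gamma, n_perm=200, seed=0):
    # Calibrate the statistic by permuting the pooled sample labels
    # (an assumed calibration, chosen here for simplicity).
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    K = gaussian_kernel(z, z, gamma)
    n_x = len(x)
    observed = mmd2_from_gram(K, n_x)
    null = np.empty(n_perm)
    for i in range(n_perm):
        idx = rng.permutation(len(z))
        null[i] = mmd2_from_gram(K[np.ix_(idx, idx)], n_x)
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value

def sample_circle(n, kappa, m, rng):
    # Unit circle (d = 1) placed in the first two coordinates of R^m.
    # kappa = 0 gives the uniform density; kappa > 0 gives a von Mises density.
    if kappa > 0:
        theta = rng.vonmises(0.0, kappa, size=n)
    else:
        theta = rng.uniform(-np.pi, np.pi, size=n)
    pts = np.zeros((n, m))
    pts[:, 0] = np.cos(theta)
    pts[:, 1] = np.sin(theta)
    return pts

# Illustrative run: the bandwidth follows the n^(-1/(d + 4*beta)) scaling from
# the abstract, with the unknown constant set to 1 (an assumption).
n, d, beta, m = 200, 1, 2, 10
gamma = n ** (-1.0 / (d + 4 * beta))
rng = np.random.default_rng(1)
x = sample_circle(n, kappa=0.0, m=m, rng=rng)
y = sample_circle(n, kappa=2.0, m=m, rng=rng)
stat, p_value = mmd_permutation_test(x, y, gamma)
print(f"gamma={gamma:.3f}  MMD^2={stat:.4f}  p-value={p_value:.4f}")
```

Note that the bandwidth here depends on the intrinsic dimension d rather than the ambient dimension m, which is the point the abstract makes about avoiding the curse of dimensionality.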
Pages: 2572-2597
Page count: 26
    [J]. TATRA MOUNTAINS MATHEMATICAL PUBLICATIONS, VOL 17, 1998, : 209 - 218