Multivariate Soft Rank via Entropy-Regularized Optimal Transport: Sample Efficiency and Generative Modeling

Cited by: 0
Authors
Bin Masud, Shoaib [1]
Werenski, Matthew [2]
Murphy, James M. [3]
Aeron, Shuchin [1]
Affiliations
[1] Tufts Univ, Dept Elect & Comp Engn, Medford, MA 02155 USA
[2] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA
[3] Tufts Univ, Dept Math, Medford, MA 02155 USA
Keywords
optimal transport; multivariate rank; high-dimensional statistics; goodness-of-fit testing; generative modeling; knockoff filtering; false discovery rate; disease
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The framework of optimal transport has been leveraged to extend the notion of rank to the multivariate setting as corresponding to an optimal transport map, while preserving desirable properties of the resulting goodness-of-fit (GoF) statistics. In particular, the rank energy (RE) and rank maximum mean discrepancy (RMMD) are distribution-free under the null, exhibit high power in statistical testing, and are robust to outliers. In this paper, we point to and alleviate some of the shortcomings of these GoF statistics that are of practical significance, namely high computational cost, curse of dimensionality in statistical sample complexity, and lack of differentiability with respect to the data. We show that all these issues are addressed by defining multivariate rank as an entropic transport map derived from the entropic regularization of the optimal transport problem, which we refer to as the soft rank. We consequently propose two new statistics, the soft rank energy (sRE) and soft rank maximum mean discrepancy (sRMMD). Given n sample data points, we provide non-asymptotic convergence rates for the sample estimate of the entropic transport map to its population version that are essentially of the order n^{-1/2} when the source measure is subgaussian and the target measure has compact support. This result is novel compared to existing results, which achieve a rate of n^{-1} but crucially rely on both measures having compact support. In contrast, the corresponding convergence rate of estimating an optimal transport map, and hence the rank map, is exponential in the data dimension. We leverage these fast convergence rates to show that the sample estimates of sRE and sRMMD converge rapidly to their population versions. Combined with the computational efficiency of methods in solving the entropy-regularized optimal transport problem, these results enable efficient rank-based GoF statistical computation, even in high dimensions.
Furthermore, the sample estimates of sRE and sRMMD are differentiable with respect to the data and amenable to popular machine learning frameworks that rely on gradient methods. We leverage these properties towards showcasing their utility for generative modeling on two important problems: image generation and generating valid knockoffs for controlled feature selection.
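As a rough illustration of the soft rank idea described in the abstract, the sketch below estimates an entropic transport map with log-domain Sinkhorn iterations and takes its barycentric projection as the soft rank. It is not the authors' implementation. The function names, the uniform reference sample on [0,1]^d, the pooling of the two samples, and the values of `eps` and `n_iter` are all assumptions made for this example.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log-sum-exp along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def soft_rank(x, ref, eps=0.1, n_iter=500):
    """Barycentric projection of the Sinkhorn plan from x onto ref (hypothetical helper)."""
    n, m = len(x), len(ref)
    # Half squared Euclidean cost between every data point and every reference point.
    cost = 0.5 * np.sum((x[:, None, :] - ref[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))  # uniform weights on the data
    log_b = np.full(m, -np.log(m))  # uniform weights on the reference sample
    f, g = np.zeros(n), np.zeros(m)
    for _ in range(n_iter):
        # Alternating dual updates of entropy-regularized optimal transport.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    plan = np.exp(log_plan)
    # Each soft rank is a weighted average of reference points (conditional mean).
    return (plan @ ref) / plan.sum(axis=1, keepdims=True)


def energy_distance(u, v):
    # V-statistic form of the energy distance between two point clouds.
    def mean_dist(p, q):
        return np.mean(np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1))
    return 2.0 * mean_dist(u, v) - mean_dist(u, u) - mean_dist(v, v)


def soft_rank_energy(x, y, eps=0.1, seed=0):
    """Energy distance between soft ranks of x and y (assumed pooled-sample construction)."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    ref = rng.uniform(size=pooled.shape)  # assumed reference: uniform on [0, 1]^d
    ranks = soft_rank(pooled, ref, eps=eps)
    return energy_distance(ranks[: len(x)], ranks[len(x):])
```

Every step is built from smooth operations (exponentials, sums, and weighted averages), which is what makes a statistic of this kind differentiable with respect to the data when written in an automatic differentiation framework. A smaller `eps` brings the soft rank closer to an unregularized transport map but generally needs more Sinkhorn iterations.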
Pages: 65