Multivariate Soft Rank via Entropy-Regularized Optimal Transport: Sample Efficiency and Generative Modeling

Cited by: 0

Authors:

Bin Masud, Shoaib [1]

Werenski, Matthew [2]

Murphy, James M. [3]

Aeron, Shuchin [1]

Affiliations:

[1] Tufts Univ, Dept Elect & Comp Engn, Medford, MA 02155 USA

[2] Tufts Univ, Dept Comp Sci, Medford, MA 02155 USA

[3] Tufts Univ, Dept Math, Medford, MA 02155 USA

Source:

Journal of Machine Learning Research | 2023 / Volume 24

Keywords:

optimal transport; multivariate rank; high-dimensional statistics; goodness-of-fit testing; generative modeling; knockoff filtering; FALSE DISCOVERY RATE; DISEASE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

The framework of optimal transport has been leveraged to extend the notion of rank to the multivariate setting as corresponding to an optimal transport map, while preserving desirable properties of the resulting goodness-of-fit (GoF) statistics. In particular, the rank energy (RE) and rank maximum mean discrepancy (RMMD) are distribution-free under the null, exhibit high power in statistical testing, and are robust to outliers. In this paper, we point to and alleviate some of the shortcomings of these GoF statistics that are of practical significance, namely high computational cost, curse of dimensionality in statistical sample complexity, and lack of differentiability with respect to the data. We show that all these issues are addressed by defining multivariate rank as an entropic transport map derived from the entropic regularization of the optimal transport problem, which we refer to as the soft rank. We consequently propose two new statistics, the soft rank energy (sRE) and soft rank maximum mean discrepancy (sRMMD). Given n sample data points, we provide non-asymptotic convergence rates for the sample estimate of the entropic transport map to its population version that are essentially of the order n^{-1/2} when the source measure is subgaussian and the target measure has compact support. This result is novel compared to existing results which achieve a rate of n^{-1} but crucially rely on both measures having compact support. In contrast, the corresponding convergence rate of estimating an optimal transport map, and hence the rank map, is exponential in the data dimension. We leverage these fast convergence rates to show that the sample estimates of sRE and sRMMD converge rapidly to their population versions. Combined with the computational efficiency of methods in solving the entropy-regularized optimal transport problem, these results enable efficient rank-based GoF statistical computation, even in high dimensions.
Furthermore, the sample estimates of sRE and sRMMD are differentiable with respect to the data and amenable to popular machine learning frameworks that rely on gradient methods. We leverage these properties towards showcasing their utility for generative modeling on two important problems: image generation and generating valid knockoffs for controlled feature selection.
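
Illustrative sketch (not taken from the paper): the abstract describes the soft rank as an entropic transport map and the sRE as a statistic built on it. The NumPy code below is one minimal way such quantities might be computed from samples, assuming a squared-Euclidean cost, log-domain Sinkhorn iterations, uniform random target points on the unit cube, a pooled-sample rank map, and a V-statistic energy distance. The function names `soft_rank` and `soft_rank_energy`, the regularization value `eps`, and the iteration count are hypothetical choices made for illustration, and the paper's exact definitions may differ.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log-sum-exp along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def soft_rank(x, u, eps=0.1, n_iter=200):
    """Sample estimate of an entropic map from points x to target points u.

    Assumption: uniform weights on both point clouds and cost 0.5 * ||x - u||^2.
    Each output row is a weighted average of the rows of u (barycentric
    projection of the Sinkhorn coupling).
    """
    n, m = x.shape[0], u.shape[0]
    cost = 0.5 * np.sum((x[:, None, :] - u[None, :, :]) ** 2, axis=-1)
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iter):
        # Alternating updates of the dual potentials in the log domain.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)
    log_plan = (f[:, None] + g[None, :] - cost) / eps + log_a[:, None] + log_b[None, :]
    # Normalize each row so the weights over target points sum to one.
    weights = np.exp(log_plan - _logsumexp(log_plan, axis=1)[:, None])
    return weights @ u


def _mean_dist(a, b):
    # Mean Euclidean distance over all pairs (a_i, b_j).
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1)).mean()


def soft_rank_energy(x, y, eps=0.1, seed=0):
    """Energy distance between the soft ranks of two samples.

    Assumption: ranks are computed on the pooled sample against uniform
    random points on [0, 1]^d drawn with a fixed seed.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    u = rng.uniform(size=pooled.shape)
    r = soft_rank(pooled, u, eps=eps)
    rx, ry = r[: len(x)], r[len(x):]
    return 2.0 * _mean_dist(rx, ry) - _mean_dist(rx, rx) - _mean_dist(ry, ry)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(100, 2))
    y_same = rng.normal(size=(100, 2))
    y_shift = rng.normal(size=(100, 2)) + 3.0
    print("same distribution:", soft_rank_energy(x, y_same))
    print("shifted distribution:", soft_rank_energy(x, y_shift))
```

Because every operation above is a smooth function of the inputs, the same computation written in an automatic-differentiation framework would yield gradients with respect to the data, which is the property the abstract relies on for generative modeling.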


Pages: 65
