We consider the problem of estimating the optimal transport map between two probability distributions, $P$ and $Q$ in $\mathbb R^d$, on the basis of i.i.d. samples. All existing statistical analyses of this problem require the assumption that the transport map is Lipschitz, a strong requirement that, in particular, excludes any examples where the transport map is discontinuous. As a first step towards developing estimation procedures for discontinuous maps, we consider the important special case where the data distribution $Q$ is a discrete measure supported on a finite number of points in $\mathbb R^d$. We study a computationally efficient estimator initially proposed by Pooladian and Niles-Weed (2021), based on entropic optimal transport, and show in the semi-discrete setting that it converges at the minimax-optimal rate $n^{-1/2}$, independent of dimension. Other standard map estimation techniques both lack finite-sample guarantees in this setting and provably suffer from the curse of dimensionality. We confirm these results in numerical experiments, and provide experiments for other settings, not covered by our theory, which indicate that the entropic estimator is a promising methodology for other discontinuous transport map estimation problems.
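To make the estimator concrete, the following is a minimal sketch of an entropic map estimate in the semi-discrete setting. It is not the authors' implementation. It assumes the quadratic cost $\frac12\|x-y\|^2$, a fixed regularization parameter `eps`, a plain log-domain Sinkhorn loop with a fixed iteration count, and known positive weights on the atoms of $Q$ (in practice these would be empirical frequencies, with zero-frequency atoms dropped). All function names are hypothetical. The map estimate at a point $x$ is the conditional mean of the atoms under the entropic coupling, that is, a softmax-weighted average of the atoms with logits $\log q_j + (g_j - \frac12\|x - y_j\|^2)/\varepsilon$, where $g$ is the Sinkhorn potential on the atoms.

```python
import math
import random


def half_sq_dist(x, y):
    """Quadratic cost 0.5 * ||x - y||^2 (assumed cost for this sketch)."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(x, y))


def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))


def sinkhorn_target_potential(xs, ys, weights, eps, n_iters=200):
    """Log-domain Sinkhorn between the uniform empirical measure on xs and
    the discrete measure sum_j weights[j] * delta_{ys[j]}.
    Returns the dual potential g on the atoms ys. Weights must be positive."""
    n, m = len(xs), len(ys)
    cost = [[half_sq_dist(x, y) for y in ys] for x in xs]
    log_a = -math.log(n)                     # uniform source weights 1/n
    log_b = [math.log(w) for w in weights]   # target weights
    g = [0.0] * m
    for _ in range(n_iters):
        # Update the source potential f given g, then g given f.
        f = [-eps * logsumexp([log_b[j] + (g[j] - cost[i][j]) / eps
                               for j in range(m)]) for i in range(n)]
        g = [-eps * logsumexp([log_a + (f[i] - cost[i][j]) / eps
                               for i in range(n)]) for j in range(m)]
    return g


def entropic_map(x, ys, weights, g, eps):
    """Barycentric projection of the entropic coupling at a new point x:
    a softmax-weighted average of the atoms ys."""
    logits = [math.log(w) + (g[j] - half_sq_dist(x, ys[j])) / eps
              for j, w in enumerate(weights)]
    lse = logsumexp(logits)
    probs = [math.exp(v - lse) for v in logits]
    return [sum(p * y[k] for p, y in zip(probs, ys)) for k in range(len(x))]


# Toy example (hypothetical data): P is uniform on [-1, 1]^2 and Q puts
# mass 1/2 on each of the atoms (-1, 0) and (1, 0).
rng = random.Random(0)
samples = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(200)]
atoms = [(-1.0, 0.0), (1.0, 0.0)]
atom_weights = [0.5, 0.5]
eps = 0.1

g = sinkhorn_target_potential(samples, atoms, atom_weights, eps)
print(entropic_map((0.5, 0.0), atoms, atom_weights, g, eps))
print(entropic_map((-0.5, 0.0), atoms, atom_weights, g, eps))
```

In this toy problem the unregularized transport map is discontinuous across the hyperplane separating the two atoms. The entropic estimate is a smooth function of $x$ that, away from that boundary and for small `eps`, places nearly all of its weight on a single atom.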