We analyze a number of natural estimators for the optimal transport map between two distributions and show that they are minimax optimal. We adopt the plugin approach: our estimators are simply optimal couplings between measures derived from our observations, appropriately extended so that they define functions on $\mathbb{R}^d$. When the underlying map is assumed to be Lipschitz, we show that computing the optimal coupling between the empirical measures, and extending it using linear smoothers, already gives a minimax optimal estimator. When the underlying map enjoys higher regularity, we show that the optimal coupling between appropriate nonparametric density estimates yields faster rates. Our work also provides new bounds on the risk of corresponding plugin estimators for the quadratic Wasserstein distance, and we show how this problem relates to that of estimating optimal transport maps using stability arguments for smooth and strongly convex Brenier potentials. As an application of our results, we derive a central limit theorem for a density plugin estimator of the squared Wasserstein distance, which is centered at its population counterpart when the underlying distributions have sufficiently smooth densities. In contrast to known central limit theorems for empirical estimators, this result easily lends itself to statistical inference for Wasserstein distances.
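To make the plugin idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes the one-dimensional case with equal sample sizes, where the optimal coupling between the two empirical measures for quadratic cost is the monotone (sorted) matching. A 1-nearest-neighbor rule is used as one simple instance of a linear smoother to extend the matching to all of $\mathbb{R}$. The function names (`empirical_ot_1d`, `nearest_neighbor_map`, `plugin_w2_squared`) and the Gaussian example are illustrative assumptions.

```python
import bisect
import random


def empirical_ot_1d(xs, ys):
    """Optimal quadratic-cost matching between two equal-size 1D samples.

    In one dimension the optimal coupling is monotone, so sorting both
    samples and pairing them in order gives the optimal matching.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sorted(xs), sorted(ys)


def nearest_neighbor_map(src_sorted, tgt_sorted):
    """Extend the matching to all of R with a 1-nearest-neighbor rule."""
    def t_hat(x):
        i = bisect.bisect_left(src_sorted, x)
        if i == 0:
            return tgt_sorted[0]
        if i == len(src_sorted):
            return tgt_sorted[-1]
        # Pick whichever neighboring source point is closer to x.
        left, right = src_sorted[i - 1], src_sorted[i]
        j = i - 1 if x - left <= right - x else i
        return tgt_sorted[j]
    return t_hat


def plugin_w2_squared(src_sorted, tgt_sorted):
    """Plugin estimate of the squared 2-Wasserstein distance (1D)."""
    n = len(src_sorted)
    return sum((a - b) ** 2 for a, b in zip(src_sorted, tgt_sorted)) / n


# Illustrative data (assumption): N(0, 1) pushed to N(1, 4), so the true
# map is T(x) = 2x + 1 and the true squared distance is 2.
rng = random.Random(0)
n = 2000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [2.0 * rng.gauss(0.0, 1.0) + 1.0 for _ in range(n)]

src, tgt = empirical_ot_1d(xs, ys)
t_hat = nearest_neighbor_map(src, tgt)
w2_sq = plugin_w2_squared(src, tgt)
```

In dimension $d > 1$ sorting no longer applies, and the empirical coupling would instead come from a general assignment or linear-programming solver. The higher-regularity estimators described above, which couple nonparametric density estimates rather than empirical measures, are not covered by this sketch.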