Optimal transport (OT) distances are increasingly used as loss functions for statistical inference, notably in the learning of generative models and in supervised learning. Yet the behavior of minimum Wasserstein estimators is poorly understood, especially in high-dimensional regimes or under model misspecification. In this work we adopt the viewpoint of projection robust (PR) OT, which seeks to maximize the OT cost between two measures by choosing a $k$-dimensional subspace onto which they can be projected. Our first contribution is to establish several fundamental statistical properties of PR Wasserstein distances, complementing and improving previous literature that has been restricted to one-dimensional and well-specified cases. Next, we propose the integral PR Wasserstein (IPRW) distance as an alternative to the PRW distance, obtained by averaging rather than optimizing over subspaces. Our complexity bounds help explain why both the PRW and IPRW distances outperform the Wasserstein distance empirically in high-dimensional inference tasks. Finally, we consider parametric inference using the PRW distance. We provide asymptotic guarantees for two types of minimum PRW estimators and formulate a central limit theorem for the max-sliced Wasserstein estimator under model misspecification. To enable our analysis of PRW distances with projection dimension larger than one, we devise a novel combination of variational analysis and statistical theory.
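The contrast between optimizing and averaging over subspaces can be illustrated in the simplest case, $k = 1$, where projections are one-dimensional and the projected OT cost has a closed form via sorting. The sketch below is illustrative only and is not the paper's algorithm: it uses Monte Carlo sampling of random directions, so the mean of the projected costs is an IPRW-style estimate, while the max is only a crude lower bound on the max-sliced (PRW, $k = 1$) distance. The function names `w1_1d` and `sliced_estimates` are hypothetical.

```python
import numpy as np

def w1_1d(x, y):
    # 1-D Wasserstein-1 distance between equal-size empirical samples:
    # sort both samples and average the absolute differences of order statistics.
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def sliced_estimates(X, Y, n_dirs=200, seed=0):
    # Monte Carlo over random unit directions theta in R^d (k = 1 case).
    # Averaging the projected costs gives an IPRW-style estimate; taking
    # the max gives a lower bound on the max-sliced (PRW, k = 1) distance.
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    thetas = rng.normal(size=(n_dirs, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    vals = np.array([w1_1d(X @ t, Y @ t) for t in thetas])
    return vals.mean(), vals.max()
```

By construction the averaged estimate never exceeds the maximized one, mirroring the fact that the IPRW distance lower-bounds the PRW distance; a genuine PRW computation would instead optimize over the subspace (e.g. over the Stiefel manifold for $k > 1$).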