Optimal transport (OT) theory and the related $p$-Wasserstein distance ($W_p$, $p\geq 1$) are popular tools in statistics and machine learning. Recent studies have remarked that inference based on OT and on $W_p$ is sensitive to outliers. To address this issue, we work on a robust version of the primal OT problem (ROBOT) and show that it defines a robust version of $W_1$, called the robust Wasserstein distance, which is able to downweight the impact of outliers. We study the properties of this novel distance and use it to define minimum distance estimators. Our novel estimators do not impose any moment restrictions: this allows us to extend the use of OT methods to inference on heavy-tailed distributions. We also provide statistical guarantees for the proposed estimators. Moreover, we derive the dual form of ROBOT and illustrate its applicability to machine learning. Numerical exercises (see also the supplementary material) provide evidence of the benefits yielded by our methods.