Optimal transport (OT) theory and the related $p$-Wasserstein distance ($W_p$, $p\geq 1$) are widely-applied in statistics and machine learning. In spite of their popularity, inference based on these tools is sensitive to outliers or it can perform poorly when the underlying model has heavy-tails. To cope with these issues, we introduce a new class of procedures. (i) We consider a robust version of the primal OT problem (ROBOT) and show that it defines the {robust Wasserstein distance}, $W^{(\lambda)}$, which depends on a tuning parameter $\lambda > 0$. (ii) We illustrate the link between $W_1$ and $W^{(\lambda)}$ and study its key measure theoretic aspects. (iii) We derive some concentration inequalities for $W^{(\lambda)}$. (iii) We use $W^{(\lambda)}$ to define minimum distance estimators, we provide their statistical guarantees and we illustrate how to apply concentration inequalities for the selection of $\lambda$. (v) We derive the {dual} form of the ROBOT and illustrate its applicability to machine learning problems (generative adversarial networks and domain adaptation). Numerical exercises provide evidence of the benefits yielded by our methods.
翻译:暂无翻译