Optimal transport maps define a one-to-one correspondence between probability distributions, and as such have grown popular for machine learning applications. However, these maps are generally defined on empirical observations and cannot be generalized to new samples while preserving asymptotic properties. We extend a novel method to learn a consistent estimator of a continuous optimal transport map from two empirical distributions. The consequences of this work are twofold: first, it makes it possible to extend the transport plan to new observations without recomputing the discrete optimal transport map; second, it provides statistical guarantees for machine learning applications of optimal transport. We illustrate the strength of this approach by deriving a consistent framework for transport-based counterfactual explanations in fairness.
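To make the limitation mentioned above concrete, the following toy sketch builds a discrete optimal transport map between two one-dimensional empirical samples and shows that it is undefined at an unseen point. It is only an illustration under simplifying assumptions (one dimension, quadratic cost, equal sample sizes, distinct source points), in which case the optimal matching is known to pair the sorted source points with the sorted target points. The function name `discrete_ot_map_1d` and the sample values are hypothetical, and the code does not implement the estimator studied in this work.

```python
def discrete_ot_map_1d(source, target):
    """Discrete OT map between two equal-size 1D samples (quadratic cost).

    Assumption: source points are distinct, so the map can be stored
    as a dict from each source point to its matched target point.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes samples of equal size")
    # In 1D with a convex cost, the optimal matching is monotone:
    # the k-th smallest source point is sent to the k-th smallest target point.
    return dict(zip(sorted(source), sorted(target)))


# Hypothetical samples, chosen only for illustration.
source = [0.2, -1.0, 0.7, 1.5]
target = [2.0, 3.1, 2.4, 4.0]

ot_map = discrete_ot_map_1d(source, target)
print(ot_map[0.7])          # image of an observed source point: 3.1

new_point = 0.5             # not part of the source sample
print(new_point in ot_map)  # False: the discrete map has no value here
```

The discrete map is a lookup table on the observed sample, so evaluating it at `new_point` would require either recomputing the matching with the new point included or extending the map beyond the sample, which is the problem addressed here.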