Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large-scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, {\em i.e.,} they average the outcomes of several smaller optimal transport problems. In this paper we propose an analysis of this practice, whose effects have not been well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators and gradients and a concentration bound around the expectation, but also with defects such as the loss of the distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs, and color transfer that highlight the practical interest of this strategy.
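To make the minibatch strategy concrete, the following is a minimal sketch of the averaging procedure described above, using the POT library (https://pythonot.github.io/) to solve each small optimal transport problem exactly. The function name \texttt{minibatch\_ot}, the batch size \texttt{m}, and the number of minibatches \texttt{k} are illustrative choices, not the paper's notation; the sketch simply averages the exact OT cost over \texttt{k} pairs of size-\texttt{m} minibatches drawn from two point clouds.

\begin{verbatim}
import numpy as np
import ot  # POT: Python Optimal Transport


def minibatch_ot(xs, xt, m=64, k=10, seed=None):
    """Average the exact OT cost over k pairs of size-m minibatches.

    xs, xt: (n, d) and (n', d) arrays of samples from the two distributions.
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(k):
        # Draw one minibatch from each point cloud.
        bs = xs[rng.choice(len(xs), size=m, replace=False)]
        bt = xt[rng.choice(len(xt), size=m, replace=False)]
        M = ot.dist(bs, bt)          # squared Euclidean cost matrix
        a = b = np.full(m, 1.0 / m)  # uniform marginals on each minibatch
        total += ot.emd2(a, b, M)    # exact OT cost of the small problem
    return total / k
\end{verbatim}

Each call to \texttt{ot.emd2} solves an $m \times m$ problem instead of one problem over the full datasets, which is the source of the computational savings; the returned average is the minibatch estimator whose bias, gradients, and concentration the paper analyzes.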