We propose a hybrid resampling method, which can be combined with any exact solver, to approximate finitely supported Wasserstein barycenters on large-scale datasets. Nonasymptotic bounds on the expected error of the objective value, as well as of the barycenters themselves, allow one to calibrate computational cost against statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on a variety of simulated datasets, as well as on a real-data example, all of which are out of reach for state-of-the-art algorithms for computing Wasserstein barycenters.
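To make the resampling idea concrete, the following is a minimal sketch under strong simplifying assumptions, not the implementation used in the paper. It assumes measures on the real line, equal barycenter weights, and the 2-Wasserstein distance, so that the exact solver can be replaced by averaging order statistics. The function names `resample`, `barycenter_1d`, and `resampled_barycenter_1d`, and the sample-size parameter `size`, are hypothetical.

```python
import numpy as np


def resample(support, weights, size, rng):
    """Draw `size` i.i.d. points from a finitely supported measure."""
    support = np.asarray(support, dtype=float)
    idx = rng.choice(len(support), size=size, p=weights)
    return support[idx]


def barycenter_1d(samples):
    """Exact 2-Wasserstein barycenter of uniform empirical measures on the
    real line that all have the same number of points.

    In one dimension the barycenter's quantile function is the average of
    the input quantile functions, which here reduces to averaging the
    order statistics. This stands in for a general exact solver.
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float), axis=1)
    return sorted_samples.mean(axis=0)


def resampled_barycenter_1d(measures, size, seed=0):
    """Replace each measure by an empirical measure on `size` resampled
    points, then solve the (much smaller) barycenter problem exactly.

    `measures` is a list of (support, weights) pairs.
    """
    rng = np.random.default_rng(seed)
    samples = [resample(s, w, size, rng) for s, w in measures]
    return barycenter_1d(samples)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Three synthetic measures, each supported on 10,000 points.
    measures = []
    for shift in (-2.0, 0.0, 2.0):
        support = rng.normal(loc=shift, size=10_000)
        weights = np.full(10_000, 1.0 / 10_000)
        measures.append((support, weights))
    approx = resampled_barycenter_1d(measures, size=200)
    print(approx.shape, approx.mean())
```

The approximate barycenter is supported on `size` points regardless of how large the original supports are, so `size` is the knob that trades computational cost against statistical accuracy.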