We propose a hybrid resampling method for approximating finitely supported Wasserstein barycenters on large-scale datasets; the method can be combined with any exact solver. Nonasymptotic bounds on the expected error of the objective value, as well as of the barycenters themselves, make it possible to balance computational cost against statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on a myriad of simulated datasets, as well as on a real-data example from cell microscopy, all of which are out of reach for state-of-the-art algorithms for computing Wasserstein barycenters.
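To make the resampling idea concrete, the following is a minimal sketch on the real line, not the implementation used in the paper. It relies on several assumptions that the abstract does not state: each input measure is replaced by an empirical measure of `size` i.i.d. draws, the "exact solver" is the one-dimensional quantile-averaging formula (standing in for any exact solver, such as the modified subgradient descent mentioned above), the barycenter weights are equal, and the `repeats` independent runs are combined as a uniform mixture. All function and parameter names are hypothetical.

```python
import numpy as np


def resample(support, weights, size, rng):
    """Draw `size` i.i.d. points from the discrete measure with the given atoms and weights."""
    idx = rng.choice(len(support), size=size, p=weights)
    return support[idx]


def barycenter_1d(samples):
    """Exact 2-Wasserstein barycenter of uniform empirical measures on the real line.

    `samples` has shape (n_measures, size). In one dimension the quantile
    function of the barycenter is the average of the input quantile functions,
    so it suffices to sort each row and average the order statistics.
    """
    return np.sort(samples, axis=1).mean(axis=0)


def resampled_barycenter(measures, size, repeats, seed=0):
    """Approximate the barycenter of `measures`, a list of (support, weights) pairs.

    Assumption: the `repeats` independent runs are combined as a uniform
    mixture of their barycenters, giving size * repeats atoms in total.
    """
    rng = np.random.default_rng(seed)
    atoms = []
    for _ in range(repeats):
        samples = np.stack([resample(x, w, size, rng) for x, w in measures])
        atoms.append(barycenter_1d(samples))
    atoms = np.concatenate(atoms)
    weights = np.full(atoms.size, 1.0 / atoms.size)
    return atoms, weights


if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 1000)
    rng = np.random.default_rng(1)
    # Three hypothetical input measures, each with 1000 atoms on [0, 1].
    measures = []
    for _ in range(3):
        w = rng.random(grid.size)
        measures.append((grid, w / w.sum()))
    atoms, weights = resampled_barycenter(measures, size=50, repeats=10)
    print(atoms.size, weights.sum())
```

In this sketch the cost of each solver call depends on `size` rather than on the number of atoms of the input measures, which is the trade-off between computational cost and statistical accuracy that the error bounds are meant to calibrate.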