We introduce weak barycenters of a family of probability distributions, based on the recently developed notion of optimal weak transport of mass by Gozlan et al. (2017) and Backhoff-Veraguas et al. (2020). We provide a theoretical analysis of this object and discuss its interpretation in the light of convex ordering between probability measures. In particular, we show that, rather than averaging the input distributions in a geometric way (as the Wasserstein barycenter based on classic optimal transport does), weak barycenters extract common geometric information shared by all the input distributions, encoded as a latent random variable that underlies all of them. We also provide an iterative algorithm to compute a weak barycenter for a finite family of input distributions, and a stochastic algorithm that computes weak barycenters for arbitrary populations of laws. The latter approach is particularly well suited to the streaming setting, i.e., when distributions are observed sequentially. The notion of weak barycenter and our approaches to computing it are illustrated on synthetic examples, validated on 2D real-world data, and compared to standard Wasserstein barycenters.