Barycenters (aka Fr\'echet means) were introduced in statistics in the 1940s and popularized in shape statistics and, later, in optimal transport and matrix analysis. They provide the most natural extension to non-Euclidean geometries of linear averaging, which is perhaps the most basic and widely used tool in data science. In various setups, their asymptotic properties, such as laws of large numbers and central limit theorems, have been established, but their non-asymptotic behaviour is still not well understood. In this work, we prove finite-sample concentration inequalities (namely, generalizations of Hoeffding's and Bernstein's inequalities) for barycenters of i.i.d. random variables in metric spaces with non-positive curvature in Alexandrov's sense. As a byproduct, we also obtain PAC guarantees for a stochastic online algorithm that computes the barycenter of a finite collection of points in a non-positively curved space. We also discuss extensions of our results to spaces with possibly positive curvature.
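To make the online computation of a barycenter concrete, here is a minimal sketch, not the algorithm analysed in this work. It assumes a toy non-positively curved space: the positive reals with distance d(x, y) = |log x - log y|, where the geodesic from x to y is t -> x^(1-t) y^t and the barycenter is the geometric mean. The step size 1/(k+1), the uniform sampling scheme, and the names `geodesic`, `inductive_mean` and `online_barycenter` are illustrative assumptions.

```python
import math
import random


def geodesic(x, y, t):
    """Point at fraction t along the geodesic from x to y.

    Assumed toy space: positive reals with d(x, y) = |log x - log y|.
    """
    return x ** (1.0 - t) * y ** t


def inductive_mean(samples):
    """Online barycenter estimate: move from the current estimate
    toward each new sample by a fraction 1/(k+1) along the geodesic."""
    it = iter(samples)
    estimate = next(it)  # requires at least one sample
    for k, y in enumerate(it, start=1):
        estimate = geodesic(estimate, y, 1.0 / (k + 1))
    return estimate


def online_barycenter(points, n_steps, seed=0):
    """Stochastic variant (assumed scheme): draw points uniformly at
    random from a finite collection and feed them to inductive_mean."""
    rng = random.Random(seed)
    draws = (rng.choice(points) for _ in range(n_steps))
    return inductive_mean(draws)


if __name__ == "__main__":
    pts = [1.0, 4.0, 16.0]
    exact = math.exp(sum(math.log(p) for p in pts) / len(pts))
    print(inductive_mean(pts), online_barycenter(pts, 10_000), exact)
```

In this flat example a single pass over the points recovers the geometric mean exactly. In a genuinely curved space the inductive mean is only an approximation, which is where non-asymptotic guarantees become relevant.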