Comparing probability distributions is at the crux of many machine learning algorithms. Maximum Mean Discrepancies (MMD) and Optimal Transport (OT) distances are two classes of distances between probability measures that have attracted considerable attention in recent years. This paper establishes some conditions under which the Wasserstein distance can be controlled by MMD norms. Our work is motivated by compressive statistical learning (CSL) theory, a general framework for resource-efficient large-scale learning in which the training data is summarized in a single vector (called a sketch) that captures the information relevant to the learning task under consideration. Inspired by existing results in CSL, we introduce the H\"older Lower Restricted Isometric Property (H\"older LRIP) and show that this property comes with interesting guarantees for compressive statistical learning. Based on the relations between the MMD and the Wasserstein distance, we provide guarantees for compressive statistical learning by introducing and studying the concept of Wasserstein learnability of the learning task, that is, when some task-specific metric between probability distributions can be bounded by a Wasserstein distance.
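To make the two families of distances concrete, the following minimal sketch (not taken from the paper; kernel bandwidth, sample sizes, and distributions are arbitrary illustrative choices) estimates the Gaussian-kernel MMD and the one-dimensional Wasserstein-1 distance between two empirical samples.

```python
# Illustrative sketch only: empirical Gaussian-kernel MMD and 1-D Wasserstein-1
# distance between two samples, the two families of distances between
# probability measures discussed in the abstract.
import numpy as np
from scipy.stats import wasserstein_distance

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2)) on 1-D points."""
    d = x[:, None] - y[None, :]
    return np.exp(-d**2 / (2.0 * sigma**2))

def mmd(x, y, sigma=1.0):
    """Biased empirical estimate of the MMD in the Gaussian-kernel RKHS."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kyy = gaussian_kernel(y, y, sigma).mean()
    kxy = gaussian_kernel(x, y, sigma).mean()
    return np.sqrt(max(kxx + kyy - 2.0 * kxy, 0.0))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=500)   # sample from a first distribution pi
y = rng.normal(0.5, 1.0, size=500)   # sample from a shifted distribution pi'
print("MMD estimate :", mmd(x, y))
print("W_1 estimate :", wasserstein_distance(x, y))
```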