Discriminating between distributions is an important problem in a number of scientific fields. This motivated the introduction of Linear Optimal Transportation (LOT), which embeds the space of distributions into an $L^2$-space. The transform is defined by computing the optimal transport of each distribution to a fixed reference distribution, and has a number of benefits when it comes to speed of computation and to determining classification boundaries. In this paper, we characterize a number of settings in which LOT embeds families of distributions into a space in which they are linearly separable. This holds in arbitrary dimension, and for families of distributions generated through perturbations of shifts and scalings of a fixed distribution. We also prove conditions under which the $L^2$ distance between the LOT embeddings of two distributions in arbitrary dimension is nearly isometric to the Wasserstein-2 distance between those distributions. This is of significant computational benefit, as one need only compute $N$ optimal transport maps to define the $N^2$ pairwise distances between $N$ distributions. We demonstrate the benefits of LOT on a number of distribution classification problems.
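To make the "$N$ maps, $N^2$ distances" point concrete, the following is a minimal sketch in the one-dimensional special case, where the transport map has a closed form. It is an illustration under stated assumptions, not the implementation used in the paper: the reference is taken to be Uniform(0, 1), so the map pushing the reference onto a distribution is that distribution's quantile function, and the helper names `lot_embed_1d` and `lot_distance` are hypothetical. In one dimension the resulting $L^2$ distance coincides with the Wasserstein-2 distance, whereas the abstract's near-isometry claim concerns arbitrary dimension.

```python
import numpy as np


def lot_embed_1d(samples, n_grid=200):
    """Embed a 1-D empirical distribution as its quantile function on a grid.

    Assumption: the reference distribution is Uniform(0, 1), so the map that
    pushes the reference onto the distribution is the quantile function.
    """
    # Midpoint grid on (0, 1); avoids the extreme quantiles at 0 and 1.
    t = (np.arange(n_grid) + 0.5) / n_grid
    return np.quantile(samples, t)


def lot_distance(emb_a, emb_b):
    """Discretized L^2 distance between two embeddings (uniform reference)."""
    return np.sqrt(np.mean((emb_a - emb_b) ** 2))


rng = np.random.default_rng(0)
base = rng.normal(size=5000)  # samples from a fixed base distribution

# A small family generated by shifts and scalings of the base distribution.
params = [(0.0, 1.0), (2.0, 1.0), (0.0, 3.0)]  # (shift, scale) pairs

# One embedding (one transport map) per distribution: N maps in total.
embeddings = [lot_embed_1d(shift + scale * base) for shift, scale in params]

# All N^2 pairwise distances are then plain L^2 distances between embeddings.
D = np.array([[lot_distance(a, b) for b in embeddings] for a in embeddings])
print(np.round(D, 3))
```

Each distribution is embedded once, and every pairwise comparison afterwards is a vector operation with no further transport problem to solve. A linear classifier can be trained directly on the rows of `embeddings`.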