The conventional sliced Wasserstein distance is defined between two probability measures whose realizations are vectors. To compare two probability measures over images, practitioners first vectorize the images and then project them onto a one-dimensional space by multiplying the sample matrix with the projection matrix. The sliced Wasserstein distance is then evaluated by averaging, over the projecting directions, the Wasserstein distances between the two corresponding one-dimensional projected probability measures. However, this approach has two limitations. First, the vectorization step does not capture the spatial structure of images efficiently, which makes it harder for the subsequent slicing process to gather discrepancy information. Second, it is memory-inefficient, since each slicing direction is a vector with the same dimension as the images. To address these limitations, we propose novel slicing methods for the sliced Wasserstein distance between probability measures over images that are based on convolution operators. We derive convolution sliced Wasserstein (CSW) and its variants by incorporating stride, dilation, and non-linear activation functions into the convolution operators. We investigate the metricity of CSW as well as its sample complexity, its computational complexity, and its connection to conventional sliced Wasserstein distances. Finally, we demonstrate the favorable performance of CSW over the conventional sliced Wasserstein distance in comparing probability measures over images and in training deep generative models on images.
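To make the contrast between the two slicing schemes concrete, the following is a minimal NumPy sketch under stated assumptions, not the implementation studied in this work. The function names (`sliced_wasserstein`, `conv_project`, `conv_sliced_wasserstein`), the two-kernel layout, the Gaussian sampling with unit Frobenius-norm normalization, and the restriction to square single-channel images with equal sample sizes are all illustrative choices. Stride, dilation, and non-linear activations are omitted.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliced_wasserstein(x, y, n_projections=100, p=2, seed=None):
    """Monte Carlo estimate of SW_p between two equal-size sample sets."""
    rng = np.random.default_rng(seed)
    # Vectorize: (n, h, w) -> (n, h*w). Spatial structure is discarded here.
    x = x.reshape(x.shape[0], -1)
    y = y.reshape(y.shape[0], -1)
    # One unit-norm direction per column; each has the full image dimension.
    theta = rng.standard_normal((x.shape[1], n_projections))
    theta /= np.linalg.norm(theta, axis=0, keepdims=True)
    # In 1D, optimal transport between equal-size samples pairs sorted values.
    x_proj = np.sort(x @ theta, axis=0)
    y_proj = np.sort(y @ theta, axis=0)
    return np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)


def conv_project(images, kernels):
    """Map (n, h, w) images to n scalars by chaining valid convolutions."""
    out = images
    for kernel in kernels:
        # windows has shape (n, h - kh + 1, w - kw + 1, kh, kw).
        windows = sliding_window_view(out, kernel.shape, axis=(1, 2))
        out = np.einsum("nijab,ab->nij", windows, kernel)
    assert out.shape[1:] == (1, 1), "kernel sizes must reduce the image to 1x1"
    return out.reshape(-1)


def conv_sliced_wasserstein(x, y, first_kernel=5, n_projections=100, p=2, seed=None):
    """Sliced Wasserstein estimate using random convolution kernels as slicers."""
    rng = np.random.default_rng(seed)
    d = x.shape[1]  # assumes square d x d images
    # Hypothetical layout: one k x k kernel, then one kernel covering the rest.
    sizes = [first_kernel, d - first_kernel + 1]
    total = 0.0
    for _ in range(n_projections):
        kernels = [rng.standard_normal((k, k)) for k in sizes]
        kernels = [k / np.linalg.norm(k) for k in kernels]  # unit Frobenius norm
        x_proj = np.sort(conv_project(x, kernels))
        y_proj = np.sort(conv_project(y, kernels))
        total += np.mean(np.abs(x_proj - y_proj) ** p)
    return (total / n_projections) ** (1.0 / p)
```

In this sketch, an 8 x 8 image needs a 64-dimensional direction under conventional slicing, whereas the convolution slicer with `first_kernel=5` stores a 5 x 5 and a 4 x 4 kernel, 41 parameters in total. Because the chained convolutions are linear, each convolution slice is still a linear projection of the image, but one restricted to directions with a shared, spatially structured form.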