Variational inference approximates an unnormalized distribution by minimizing the Kullback-Leibler (KL) divergence. Although this divergence is computationally efficient and widely used in applications, it has some undesirable properties. For example, it is not a proper metric: it is non-symmetric and does not satisfy the triangle inequality. Optimal transport distances, on the other hand, have recently shown advantages over the KL divergence. Building on these advantages, we propose a new variational inference method that minimizes the sliced Wasserstein distance, a valid metric arising from optimal transport. This sliced Wasserstein distance can be approximated simply by running MCMC, without solving any optimization problem. Moreover, our approximation does not require a tractable density for the variational distribution, so the approximating family can be amortized by generators such as neural networks. We also provide an analysis of the theoretical properties of our method. Experiments on synthetic and real data illustrate the performance of the proposed method.
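To make the construction concrete, the following is a minimal sketch, not the paper's exact estimator, of how a sliced Wasserstein distance between two sets of samples can be estimated by Monte Carlo over random projection directions; the function name, parameters, and sample sizes are illustrative assumptions.

```python
# A minimal sketch (assumed, not the paper's estimator) of the sliced Wasserstein
# distance between two empirical sample sets, estimated with random projections.
import numpy as np

def sliced_wasserstein(x, y, n_projections=100, p=2, rng=None):
    """Estimate SW_p between samples x, y of shape (n, d); assumes equal n."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Draw random directions uniformly on the unit sphere.
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project both sample sets onto each direction (one-dimensional marginals).
    x_proj = x @ theta.T   # shape (n, n_projections)
    y_proj = y @ theta.T
    # In one dimension the optimal coupling matches order statistics, so the
    # Wasserstein distance is obtained by sorting: no optimization is solved.
    x_sorted = np.sort(x_proj, axis=0)
    y_sorted = np.sort(y_proj, axis=0)
    sw_p = np.mean(np.abs(x_sorted - y_sorted) ** p)
    return sw_p ** (1.0 / p)

# Usage: compare samples from a variational generator with MCMC draws from the
# (unnormalized) target; both inputs are plain arrays of samples, so no
# tractable density of the variational family is required.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target_samples = rng.normal(loc=1.0, size=(500, 2))       # stand-in for MCMC draws
    variational_samples = rng.normal(loc=0.0, size=(500, 2))   # stand-in for generator output
    print(sliced_wasserstein(target_samples, variational_samples, rng=1))
```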