Optimal transport (OT) distances have been widely used in recent machine learning work to compare probability distributions. They are costly to compute when the data lives in a high-dimensional space. Recent work by Paty et al. (2019) aims specifically at reducing this cost by computing OT on low-rank projections of the data (seen as discrete measures). We extend this approach and show that OT distances can be approximated using more general families of maps, provided they are 1-Lipschitz. The best estimate is obtained by maximising OT over the given family. Because the OT computations are done after mapping the data to a lower-dimensional space, our method scales well with the original data dimension. We demonstrate the idea with neural networks.
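As a rough illustration of "maximising OT over a family of 1-Lipschitz maps", the sketch below is not the method of the paper. It swaps the neural networks for a much simpler family, unit-norm linear projections onto one dimension (each of which is 1-Lipschitz), and replaces gradient-based maximisation with a plain search over random directions. The function names `wasserstein_1d` and `max_projected_ot` and all parameter choices are hypothetical. It also assumes both samples have the same number of uniformly weighted points, so that the one-dimensional OT reduces to matching sorted values.

```python
import numpy as np


def wasserstein_1d(u, v, p=2):
    """p-Wasserstein distance between two 1-D empirical measures.

    Assumes u and v have the same length and uniform weights, in which
    case the optimal coupling matches sorted values to sorted values.
    """
    u_sorted = np.sort(u)
    v_sorted = np.sort(v)
    return float(np.mean(np.abs(u_sorted - v_sorted) ** p) ** (1.0 / p))


def max_projected_ot(x, y, n_directions=256, p=2, seed=0):
    """Largest 1-D OT value over random unit-norm linear projections.

    Hypothetical helper: x and y are (n, d) arrays of samples. Each map
    z -> <z, theta> with ||theta|| = 1 is 1-Lipschitz, so every projected
    OT value is at most the OT distance in the original space, and the
    maximum over the family is the tightest estimate the family offers.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[1]
    directions = rng.normal(size=(n_directions, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    best = 0.0
    for theta in directions:
        # OT is computed in 1-D, after the data has been mapped down.
        best = max(best, wasserstein_1d(x @ theta, y @ theta, p))
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(size=(200, 50))
    shift = np.full(50, 0.5)
    y = x + shift  # a pure translation of x
    print("estimate:", max_projected_ot(x, y))
    print("upper bound (norm of the shift):", np.linalg.norm(shift))
```

In this toy case the second sample is a translation of the first, so the OT distance in the original space is at most the norm of the shift, and the printed estimate stays below that value. The cost per direction is dominated by a sort, which is what makes working after the projection cheap compared with solving OT in the original dimension.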