Optimal transport (OT) is a popular tool in machine learning to compare probability measures geometrically, but it comes with substantial computational burden. Linear programming algorithms for computing OT distances scale cubically in the size of the input, making OT impractical in the large-sample regime. We introduce a practical algorithm, which relies on a quantization step, to estimate OT distances between measures given cheap sample access. We also provide a variant of our algorithm to improve the performance of approximate solvers, focusing on those for entropy-regularized transport. We give theoretical guarantees on the benefits of this quantization step and display experiments showing that it behaves well in practice, providing a practical approximation algorithm that can be used as a drop-in replacement for existing OT estimators.
翻译:最佳运输(OT)是机器学习以比较几何测量概率的常用工具,但也有相当大的计算负担。 以输入量的大小分线计算OT距离比例的线性编程算法,使大抽样制度中OT不切实际。 我们引入了一种实际算法,依靠一个量化步骤来估计OT在措施之间的距离,以获得廉价的样本访问。 我们还提供了一种算法的变种,以改善近似解答器的性能,重点是对正态运输的解算法。 我们从理论上保证了这一量化步骤的好处,并展示了实验,表明它在实践中表现良好,提供了一种实用的近似算法,可以用作现有OT估计器的低位替代物。