While the optimal transport (OT) problem was originally formulated as a linear program, the addition of entropic regularization has proven beneficial both computationally and statistically, for many applications. The Sinkhorn fixed-point algorithm is the most popular approach to solve this regularized problem, and, as a result, multiple attempts have been made to reduce its runtime using, e.g., annealing in the regularization parameter, momentum or acceleration. The premise of this work is that initialization of the Sinkhorn algorithm has received comparatively little attention, possibly due to two preconceptions: since the regularized OT problem is convex, it may not be worth crafting a good initialization, since any is guaranteed to work; secondly, because the outputs of the Sinkhorn algorithm are often unrolled in end-to-end pipelines, a data-dependent initialization would bias Jacobian computations. We challenge this conventional wisdom, and show that data-dependent initializers result in dramatic speed-ups, with no effect on differentiability as long as implicit differentiation is used. Our initializations rely on closed-forms for exact or approximate OT solutions that are known in the 1D, Gaussian or GMM settings. They can be used with minimal tuning, and result in consistent speed-ups for a wide variety of OT problems.
翻译:----
虽然最优输运问题最初被制定为线性规划,但加入熵正则化已被证明在许多应用中具有计算和统计上的优势。Sinkhorn不动点算法是解决这个正则化问题的最流行方法。因此,已经尝试多种方法通过在正则化参数中使用退火、动量或加速来减少其运行时间。这项工作的前提是Sinkhorn算法的初始化相对较少被关注,可能由于两种偏见:因为正则化的OT问题是凸的,所以可能不值得制定一个好的初始化,因为任何一个都是有保证的。其次,因为Sinkhorn算法的输出通常在端到端的管道中展开,因此数据相关的初始化会导致Jacobian计算的偏见。我们挑战这种传统智慧,并显示出数据相关初始化会导致显著加速,在使用隐式微分时对可微性没有影响。我们的初始化依赖于一维、高斯或GMM设置中已知的精确或近似OT解的解析形式。它们可以使用最小的调整,并在各种OT问题中获得一致的加速。