Optimal transport (OT) theory has been used in machine learning to study and characterize maps that efficiently push forward one probability measure onto another. Recent works have drawn inspiration from Brenier's theorem, which states that when the ground cost is the squared-Euclidean distance, the ``best'' map to morph a continuous measure in $\mathcal{P}(\Rd)$ into another must be the gradient of a convex function. To exploit that result, [Makkuva+ 2020, Korotin+ 2020] consider maps $T=\nabla f_\theta$, where $f_\theta$ is an input convex neural network (ICNN), as defined by Amos+ 2017, and fit $\theta$ with SGD using samples. Despite their mathematical elegance, fitting OT maps with ICNNs raises many challenges, notably the many constraints imposed on $\theta$, the need to approximate the conjugate of $f_\theta$, and the restriction to the squared-Euclidean cost. More generally, we question the relevance of using Brenier's result, which only applies to densities, to constrain the architecture of candidate maps fitted on samples. Motivated by these limitations, we propose a radically different approach to estimating OT maps: given a cost $c$ and a reference measure $\rho$, we introduce a regularizer, the Monge gap $\mathcal{M}^c_{\rho}(T)$ of a map $T$. That gap quantifies how far a map $T$ deviates from the ideal properties we expect from a $c$-OT map. In practice, we drop all architecture requirements for $T$ and simply minimize a distance (e.g., the Sinkhorn divergence) between $T\sharp\mu$ and $\nu$, regularized by $\mathcal{M}^c_\rho(T)$. We study $\mathcal{M}^c_{\rho}$, and show how our simple pipeline significantly outperforms other baselines in practice.
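To make the regularizer concrete, here is a minimal sketch of an empirical Monge gap on $n$ samples from $\rho$. It assumes the gap is the average cost paid by $T$ minus the optimal transport cost between the samples and their images, a definition the abstract above does not spell out. The names `monge_gap` and `sq_euclidean_cost` are hypothetical, and the optimal cost is computed with an exact assignment solver (SciPy's `linear_sum_assignment`) rather than with any particular solver from the paper.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def sq_euclidean_cost(x, y):
    """Pairwise cost matrix C[i, j] = ||x_i - y_j||^2 (illustrative choice of c)."""
    diff = x[:, None, :] - y[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def monge_gap(T, x, cost=sq_euclidean_cost):
    """Empirical gap (assumed definition): mean c(x_i, T(x_i)) minus the
    optimal cost of matching the points x to their images T(x)."""
    y = T(x)
    C = cost(x, y)
    # Cost actually paid by T: it sends x_i to y_i, i.e. the diagonal of C.
    map_cost = np.mean(np.diag(C))
    # With uniform weights on n points each, optimal transport between the
    # two point clouds reduces to an assignment problem, solved exactly here.
    rows, cols = linear_sum_assignment(C)
    ot_cost = C[rows, cols].mean()
    return map_cost - ot_cost


rng = np.random.default_rng(0)
x = rng.normal(size=(64, 2))  # samples standing in for rho

# A translation is the gradient of a convex function, so its gap should vanish.
gap_shift = monge_gap(lambda z: z + np.array([1.0, -2.0]), x)
# The point reflection z -> -z is not optimal for the squared-Euclidean cost.
gap_flip = monge_gap(lambda z: -z, x)
```

Under this definition the gap is non-negative, because pairing each $x_i$ with its own image $T(x_i)$ is one feasible assignment. In a training loop it would be added, with a weight, to the fitting term between $T\sharp\mu$ and $\nu$; a differentiable (for instance entropic) solver would then replace the exact assignment used in this sketch.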