Over the past few years, numerous computational models have been developed to solve Optimal Transport (OT) in a stochastic setting, where distributions are represented by samples. In such settings, the goal is to find a transport map that generalizes well to unseen data, ideally the map closest to the ground truth, which is unknown in practice. However, in the absence of ground truth, no quantitative criterion has been put forward to measure the generalization performance of a map, although such a criterion is crucial for model selection. We propose to leverage the Brenier formulation of OT to perform this task. Theoretically, we show that this formulation guarantees that, up to a distortion parameter that depends on smoothness and strong convexity and a statistical deviation term, the selected map achieves the lowest quadratic error to the ground truth. This criterion, estimated via convex optimization, enables parameter and model selection among entropic regularization of OT, input convex neural networks, and smooth and strongly convex nearest-Brenier (SSNB) models. Finally, we run an experiment questioning the use of OT in domain adaptation. Using the criterion, we identify the potential that is closest to the true OT map between the source and the target, and we observe that this selected potential is not the one that performs best on the downstream transfer classification task.