The power and flexibility of Optimal Transport (OT) have pervaded a wide spectrum of problems, including recent Machine Learning challenges such as unsupervised domain adaptation. Its essence, quantitatively relating two probability distributions under an optimal metric, has been creatively exploited and shown to hold promise for many real-world data challenges. In the present work, we posit that the robustness of domain adaptation is rooted in the intrinsic (latent) representations of the respective data, which inherently lie on a non-linear submanifold embedded in a higher-dimensional Euclidean space. We account for these geometric properties by refining the $l^2$ Euclidean metric to better reflect the geodesic distance between two distinct representations. We integrate both a metric correction term and a prior cluster structure in the source data into the OT-driven adaptation. We show that this is tantamount to an implicit Bayesian framework, which we demonstrate yields a more robust and better-performing approach to domain adaptation. Experiments substantiating these claims are included.