Optimal transport (OT) theory provides powerful tools for comparing probability measures. However, OT is limited to nonnegative measures of equal mass, and it suffers from serious computational and statistical drawbacks. This has led to several proposals for regularized variants of OT in the recent literature. In this work, we consider an \textit{entropy partial transport} (EPT) problem for nonnegative measures on a tree that may have different masses. We show that EPT is equivalent to a standard complete OT problem on a one-node extended tree. We derive its dual formulation and leverage it to propose a novel regularization for EPT that admits fast computation and is negative definite. To our knowledge, the proposed regularized EPT is the first approach among available variants of unbalanced OT that yields a \textit{closed-form} solution. For practical applications in which the tree structure underlying the measures is not known a priori, we propose tree-sliced variants of the regularized EPT, computed by averaging the regularized EPT between the measures over random tree metrics built adaptively from the support data points. Exploiting the negative definiteness of our regularized EPT, we introduce a positive definite kernel and evaluate it against other baselines on benchmark tasks such as document classification with word embeddings and topological data analysis. In addition, we empirically demonstrate that our regularization also provides effective approximations.
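For intuition on why a tree metric can give a closed-form transport cost, the following is a minimal sketch of the standard tree-Wasserstein distance between two measures of equal mass: the cost is the sum, over edges, of the edge length times the absolute difference in subtree mass. This is the classical balanced case, not the regularized EPT proposed here. The function name `tree_wasserstein`, the parent-array encoding, and the assumption that every non-root node has a larger index than its parent are illustrative choices, not taken from the work.

```python
def tree_wasserstein(parent, edge_weight, mu, nu):
    """Balanced tree-Wasserstein distance (illustrative sketch).

    Assumptions (hypothetical encoding, not from the source):
      - node 0 is the root, and parent[i] < i for every i > 0;
      - edge_weight[i] is the length of the edge (parent[i], i),
        and edge_weight[0] is unused;
      - mu and nu are nonnegative masses per node with equal totals.
    """
    n = len(parent)
    # diff[i] accumulates (mu - nu) over the subtree rooted at node i.
    diff = [m - v for m, v in zip(mu, nu)]
    total = 0.0
    # Visit children before parents, relying on parent[i] < i.
    for node in range(n - 1, 0, -1):
        # Mass imbalance below this edge must cross it exactly once.
        total += edge_weight[node] * abs(diff[node])
        diff[parent[node]] += diff[node]
    return total


# Path 0 - 1 - 2 with edge lengths 1 and 2; move unit mass from node 0 to node 2.
print(tree_wasserstein([-1, 0, 1], [0.0, 1.0, 2.0], [1, 0, 0], [0, 0, 1]))
```

A single pass over the nodes suffices, so the cost is linear in the tree size, which is the kind of computational advantage that tree-based formulations build on.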