Fast Regularized Discrete Optimal Transport with Group-Sparse Regularizers

Regularized discrete optimal transport (OT) is a powerful tool for measuring the distance between two discrete distributions constructed from data samples on two different domains. While it has a wide range of applications in machine learning, in some settings, such as unsupervised domain adaptation, only the data sampled from one of the domains have class labels. In this kind of problem setting, a group-sparse regularizer is frequently used as the regularization term to handle class labels. In particular, it preserves the label structure of the data samples by assigning the samples that share a class label to one group-sparse regularization term. As a result, we can measure the distance while utilizing label information by solving the regularized optimization problem with gradient-based algorithms. However, the gradient computation is expensive when the number of classes or data samples is large, because the number of regularization terms and their respective sizes also become large. This paper proposes fast discrete OT with group-sparse regularizers. Our method is based on two ideas. The first is to safely skip the computation of gradients that must be zero. The second is to efficiently extract the gradients that are expected to be nonzero. Our method is guaranteed to return the same value of the objective function as the original method. Experiments show that our method is up to 8.6 times faster than the original method without degrading accuracy.
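
The first idea, skipping gradients that must be zero, can be illustrated with a generic group soft-thresholding step. The sketch below is not the method of this paper; it is a minimal NumPy illustration under stated assumptions. Each group stands in for a block of entries tied to one class label, a block is known to be zero whenever its Euclidean norm is at most the threshold `mu`, and a cheap upper bound on that norm is assumed to be available from cached norms of a previous iterate via the triangle inequality. The function names, the threshold `mu`, and the perturbation bound `eps` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(v, groups, mu):
    """Baseline: shrink every group; groups with norm <= mu become zero."""
    out = np.zeros_like(v)
    for idx in groups:
        norm = np.linalg.norm(v[idx])
        if norm > mu:
            out[idx] = (1.0 - mu / norm) * v[idx]
    return out

def group_soft_threshold_skip(v, groups, mu, upper_bounds):
    """Same output, but skip groups whose norm upper bound is <= mu.

    upper_bounds[g] is assumed to satisfy ||v[groups[g]]||_2 <= upper_bounds[g].
    """
    out = np.zeros_like(v)
    skipped = 0
    for idx, ub in zip(groups, upper_bounds):
        if ub <= mu:
            # The true norm cannot exceed mu, so the block is exactly zero.
            skipped += 1
            continue
        norm = np.linalg.norm(v[idx])
        if norm > mu:
            out[idx] = (1.0 - mu / norm) * v[idx]
    return out, skipped

# Toy data (hypothetical): 5 groups of 4 entries, three of them tiny.
rng = np.random.default_rng(0)
groups = [np.arange(4 * g, 4 * (g + 1)) for g in range(5)]
scales = np.repeat([0.01, 0.01, 5.0, 5.0, 0.01], 4)
v_old = rng.normal(size=20) * scales
cached_norms = np.array([np.linalg.norm(v_old[idx]) for idx in groups])

# The new iterate differs from the old one by at most eps per entry.
eps = 0.01
v_new = v_old + rng.uniform(-eps, eps, size=20)

# Triangle inequality: ||v_new_g|| <= ||v_old_g|| + eps * sqrt(group size).
upper_bounds = cached_norms + eps * np.sqrt(4)

mu = 1.0
dense = group_soft_threshold(v_new, groups, mu)
fast, skipped = group_soft_threshold_skip(v_new, groups, mu, upper_bounds)
```

Because the bound never underestimates the true norm, the skipping variant returns the same vector as the baseline while avoiding norm computations for the skipped groups, which mirrors the abstract's claim that the objective value is unchanged.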
