Recent work reported the label alignment property in a supervised learning setting: the vector of all labels in the dataset lies mostly in the span of the top few singular vectors of the data matrix. Inspired by this observation, we derive a regularization method for unsupervised domain adaptation. Instead of regularizing representation learning, as popular domain adaptation methods do, we regularize the classifier so that the target domain predictions ``align'', to some extent, with the top singular vectors of the unlabeled data matrix from the target domain. In a linear regression setting, we theoretically justify the label alignment property and characterize the optimality of the solution of our regularization by bounding its distance to the optimal solution. We conduct experiments showing that our method works well on label shift problems, where classic domain adaptation methods are known to fail. We also report mild improvements over domain adaptation baselines on a set of commonly used MNIST-USPS domain adaptation tasks and on cross-lingual sentiment analysis tasks.
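To make the label alignment property concrete, the following is a minimal sketch, assuming NumPy, a data matrix whose rows are samples (so the relevant singular vectors are the left ones), and synthetic data with a fast-decaying spectrum. The helper `alignment_fraction` and all data below are hypothetical illustrations, not code or results from the paper. It measures the fraction of the squared norm of the label vector that lies in the span of the top-k singular vectors.

```python
import numpy as np


def alignment_fraction(X, y, k):
    """Fraction of ||y||^2 captured by the top-k left singular vectors of X.

    Hypothetical helper for illustration; X has shape (n_samples, n_features).
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    coeffs = U[:, :k].T @ y  # coordinates of y along the top-k singular vectors
    return float(np.sum(coeffs ** 2) / np.sum(y ** 2))


# Synthetic example (an assumption): three dominant directions, then a flat tail.
rng = np.random.default_rng(0)
n, d = 200, 20
U0, _ = np.linalg.qr(rng.standard_normal((n, d)))
V0, _ = np.linalg.qr(rng.standard_normal((d, d)))
s = np.concatenate([[10.0, 8.0, 6.0], np.full(d - 3, 0.1)])
X = U0 @ np.diag(s) @ V0.T

# Labels generated by a linear model with small noise.
w = rng.standard_normal(d)
y = X @ w + 0.01 * rng.standard_normal(n)

for k in (1, 3, 10, d):
    print(k, alignment_fraction(X, y, k))
```

A value close to 1 for small k indicates that the labels are mostly explained by the top few singular vectors, which is the property the regularizer is meant to encourage for target domain predictions.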