Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant representations to bridge the domain gap. More recently, self-training has been gaining momentum in UDA. Originating from semi-supervised learning, self-training uses unlabeled data efficiently by training on pseudo-labels. However, as corroborated in this work, under the distributional shift in UDA, the pseudo-labels can be unreliable, deviating substantially from the ground-truth labels. We therefore propose Cycle Self-Training (CST), a principled self-training algorithm that explicitly enforces pseudo-labels to generalize across domains. In the forward step, CST generates target pseudo-labels with a source-trained classifier. In the reverse step, CST trains a target classifier using the target pseudo-labels, and then updates the shared representations to make the target classifier perform well on the source data. We introduce the Tsallis entropy as a novel regularization to improve the quality of the target pseudo-labels. On quadratic neural networks, we prove that CST recovers the target ground truth while both invariant feature learning and vanilla self-training fail. Empirical results indicate that CST significantly improves over prior state-of-the-art methods on standard UDA benchmarks across visual recognition and sentiment analysis tasks.
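To make the forward/reverse cycle concrete, the following is a minimal, hypothetical PyTorch sketch of one CST training step. It assumes a shared feature extractor `backbone`, a source classifier `head_s`, and a target classifier `head_t` (all names are illustrative, not the authors' reference implementation), and it approximates the reverse step with a joint loss rather than the paper's bi-level update.

```python
import torch
import torch.nn.functional as F


def tsallis_entropy(probs, alpha=1.9, eps=1e-8):
    """Tsallis entropy S_alpha(p) = (1 - sum_i p_i^alpha) / (alpha - 1),
    averaged over the batch; it recovers Shannon entropy as alpha -> 1."""
    return ((1.0 - (probs + eps).pow(alpha).sum(dim=1)) / (alpha - 1.0)).mean()


def cst_step(backbone, head_s, head_t, x_s, y_s, x_t, alpha=1.9):
    # Shared representations for source and target minibatches.
    f_s, f_t = backbone(x_s), backbone(x_t)

    # Source supervision for the source classifier.
    loss_src = F.cross_entropy(head_s(f_s), y_s)

    # Forward step: the source-trained classifier produces target pseudo-labels.
    p_t = F.softmax(head_s(f_t), dim=1)
    pseudo = p_t.argmax(dim=1).detach()

    # Reverse step (simplified): fit the target classifier to the pseudo-labels,
    # then require it to perform well on the labeled source data, which
    # back-propagates through the shared representations.
    loss_pseudo = F.cross_entropy(head_t(f_t), pseudo)
    loss_cycle = F.cross_entropy(head_t(f_s), y_s)

    # Tsallis-entropy regularization to sharpen target predictions.
    loss_ent = tsallis_entropy(p_t, alpha)

    return loss_src + loss_pseudo + loss_cycle + loss_ent
```

In this sketch the three losses are simply summed with unit weights; in practice the trade-off coefficients and the entropic index `alpha` would be tuned, and the reverse step would train `head_t` to convergence (or via a differentiable inner update) before evaluating it on source data.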