Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant representations to narrow the domain shift. Recently, self-training, which exploits unlabeled target data by training with target pseudo-labels, has been gaining momentum in UDA. However, as corroborated in this work, under the distributional shift of UDA, the pseudo-labels can be unreliable, exhibiting a large discrepancy from the target ground truth. To address this, we propose Cycle Self-Training (CST), a principled self-training algorithm that explicitly enforces pseudo-labels to generalize across domains. CST cycles between a forward step and a reverse step until convergence. In the forward step, CST generates target pseudo-labels with a source-trained classifier. In the reverse step, CST trains a target classifier using the target pseudo-labels, and then updates the shared representations to make the target classifier perform well on the source data. We introduce the Tsallis entropy as a confidence-friendly regularizer to improve the quality of target pseudo-labels. We analyze CST theoretically under realistic assumptions, and provide hard cases where CST recovers the target ground truth while both invariant feature learning and vanilla self-training fail. Empirical results indicate that CST significantly improves over the state of the art on visual recognition and sentiment analysis benchmarks.
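To make the forward/reverse cycle concrete, below is a minimal PyTorch sketch of one CST iteration. It assumes a shared feature extractor `phi`, a source head `h_s`, and a target head `h_t`; all module shapes, the inner-loop step count, and the hyperparameters are illustrative assumptions, not the authors' reference implementation. The outer update treats the fitted target head as fixed, a common first-order approximation of the bi-level objective.

```python
# A hypothetical, first-order sketch of one Cycle Self-Training (CST) iteration.
import torch
import torch.nn as nn
import torch.nn.functional as F

feat_dim, num_classes, alpha = 256, 31, 1.9  # assumed sizes; alpha is the Tsallis index

phi = nn.Sequential(nn.Linear(2048, feat_dim), nn.ReLU())  # shared representation
h_s = nn.Linear(feat_dim, num_classes)                     # source classifier
h_t = nn.Linear(feat_dim, num_classes)                     # target classifier

opt_outer = torch.optim.SGD(list(phi.parameters()) + list(h_s.parameters()), lr=1e-3)
opt_inner = torch.optim.SGD(h_t.parameters(), lr=1e-2)

def tsallis_entropy(logits, alpha):
    """Tsallis entropy of the softmax predictions: (1 - sum_i p_i^alpha) / (alpha - 1)."""
    p = F.softmax(logits, dim=1)
    return ((1.0 - p.pow(alpha).sum(dim=1)) / (alpha - 1.0)).mean()

def cst_step(xs, ys, xt):
    # Forward step: pseudo-label the target batch with the source classifier.
    with torch.no_grad():
        pseudo = h_s(phi(xt)).argmax(dim=1)

    # Reverse step, inner loop: fit the target classifier on target pseudo-labels.
    # Detaching the features keeps the inner updates confined to h_t.
    ft_fixed = phi(xt).detach()
    for _ in range(5):  # a few gradient steps stand in for training h_t to convergence
        opt_inner.zero_grad()
        F.cross_entropy(h_t(ft_fixed), pseudo).backward()
        opt_inner.step()

    # Reverse step, outer update: the target classifier should also perform well
    # on source data; h_t is held fixed here (only opt_inner ever updates it).
    opt_outer.zero_grad()
    fs, ft = phi(xs), phi(xt)
    loss = (F.cross_entropy(h_s(fs), ys)         # supervised source loss
            + F.cross_entropy(h_t(fs), ys)       # cycle loss: h_t evaluated on source
            + tsallis_entropy(h_s(ft), alpha))   # confidence-friendly regularizer
    loss.backward()
    opt_outer.step()
    return loss.item()

# Toy usage with random tensors standing in for source/target mini-batches.
xs, ys = torch.randn(32, 2048), torch.randint(0, num_classes, (32,))
xt = torch.randn(32, 2048)
print(cst_step(xs, ys, xt))
```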
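For orientation, the standard Tsallis entropy of a prediction vector $p = (p_1, \dots, p_K)$ with index $\alpha > 1$ is shown below; as $\alpha \to 1$ it recovers the Shannon entropy. This is the textbook form of the quantity the regularizer is built on, not necessarily the paper's exact instantiation (e.g., the choice of $\alpha$).

```latex
% Tsallis entropy with index \alpha; the \alpha -> 1 limit gives Shannon entropy
% (by L'Hopital's rule on the first expression).
\[
  H_\alpha(p) \;=\; \frac{1}{\alpha - 1}\Bigl(1 - \sum_{i=1}^{K} p_i^{\alpha}\Bigr),
  \qquad
  \lim_{\alpha \to 1} H_\alpha(p) \;=\; -\sum_{i=1}^{K} p_i \log p_i .
\]
```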