Most existing research on domain generalization assumes that the source data gathered from multiple domains are fully annotated. In real-world applications, however, high annotation costs often mean that only a few labels are available from each source domain, alongside abundant unlabeled data that are much easier to obtain. In this work, we investigate semi-supervised domain generalization (SSDG), a more realistic and practical setting. Our proposed approach, StyleMatch, is inspired by FixMatch, a state-of-the-art semi-supervised learning method based on pseudo-labeling, and adds several new ingredients tailored to SSDG. Specifically, 1) to mitigate overfitting to the scarce labeled source data while improving robustness against noisy pseudo labels, we model the classifier's weights, viewed as class prototypes, stochastically with Gaussian distributions; 2) to enhance generalization under domain shift, we upgrade FixMatch's two-view consistency learning paradigm, based on weak and strong augmentations, to a multi-view version with style augmentation as a third, complementary view. To provide a comprehensive study and evaluation, we establish two SSDG benchmarks, which cover a wide range of strong baseline methods developed in relevant areas, including domain generalization and semi-supervised learning. Extensive experiments demonstrate that StyleMatch achieves the best out-of-distribution generalization performance in the low-data regime. We hope our approach and benchmarks can pave the way for future research on data-efficient and generalizable learning systems.
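To make the two ingredients concrete, the following is a minimal, dependency-free sketch, not the authors' implementation. It assumes a plain dot-product classifier over class prototypes, a FixMatch-style confidence threshold of 0.95, and an unweighted average of the cross-entropy terms over the non-weak views; the abstract specifies none of these details, and all function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_prototypes(mu, sigma, rng):
    """Draw one weight matrix W ~ N(mu, sigma^2), one row (prototype) per class.

    mu and sigma are lists of rows with identical shapes (assumed layout).
    """
    return [
        [rng.gauss(m, s) for m, s in zip(mu_row, sigma_row)]
        for mu_row, sigma_row in zip(mu, sigma)
    ]


def classify(features, prototypes):
    """Class probabilities from dot products between features and prototypes."""
    logits = [sum(w * f for w, f in zip(row, features)) for row in prototypes]
    return softmax(logits)


def pseudo_label(probs, threshold=0.95):
    """Return (predicted class, keep flag); keep only confident predictions."""
    confidence = max(probs)
    return probs.index(confidence), confidence >= threshold


def multi_view_loss(weak_probs, other_view_probs, threshold=0.95):
    """Average cross-entropy of the other views against the weak view's pseudo label.

    other_view_probs holds the predictions on the strongly augmented and
    style-augmented views. Returns 0.0 when the weak view is not confident.
    """
    label, keep = pseudo_label(weak_probs, threshold)
    if not keep:
        return 0.0
    losses = [-math.log(p[label]) for p in other_view_probs]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    rng = random.Random(0)
    mu = [[1.0, 0.0], [0.0, 1.0]]        # two class prototypes (toy values)
    sigma = [[0.1, 0.1], [0.1, 0.1]]     # per-weight standard deviations
    weights = sample_prototypes(mu, sigma, rng)
    weak = classify([6.0, 0.0], weights)      # stand-in for the weak view
    strong = classify([2.0, 1.0], weights)    # stand-in for the strong view
    styled = classify([1.5, 0.5], weights)    # stand-in for the style view
    print(multi_view_loss(weak, [strong, styled]))
```

Sampling the prototypes anew on each forward pass acts as a regularizer on the classifier, while the threshold discards unlabeled samples whose weak-view prediction is too uncertain to serve as a training target.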