Domain generalization (DG) utilizes multiple labeled source datasets to train a model that generalizes to unseen target domains. However, due to high annotation costs, the requirement of labeling all source data is hard to meet in real-world applications. In this paper, we investigate a Single Labeled Domain Generalization (SLDG) task in which only one source domain is labeled, which is more practical and challenging than Conventional Domain Generalization (CDG). A major obstacle in the SLDG task is the discriminability-generalization bias: discriminative information in the labeled source dataset may contain domain-specific bias, constraining the generalization of the trained model. To tackle this challenging task, we propose a novel method called Domain-Specific Bias Filtering (DSBF), which initializes a discriminative model with the labeled source data and filters out its domain-specific bias with the unlabeled source data to improve generalization. We divide the filtering process into: (1) feature-extractor debiasing via k-means-clustering-based semantic feature re-extraction; and (2) classifier calibration via attention-guided semantic feature projection. DSBF unifies the exploration of the labeled and the unlabeled source data to enhance both the discriminability and the generalization of the trained model, resulting in a highly generalizable model. We further provide a theoretical analysis to verify the proposed domain-specific bias filtering process. Extensive experiments on multiple datasets show the superior performance of DSBF on both the challenging SLDG task and the CDG task.
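The clustering-based re-extraction step described above can be illustrated with a minimal sketch: cluster the unlabeled-source features with k-means, then pseudo-label each cluster by its nearest labeled-class prototype. All tensor shapes, sample counts, and the prototype-matching rule here are illustrative assumptions, not the paper's exact implementation.

```python
# Hypothetical sketch of DSBF's feature-extractor debiasing step:
# k-means on unlabeled-source features + pseudo-labeling by nearest
# labeled-class prototype. Data and dimensions are synthetic stand-ins.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
num_classes, feat_dim = 3, 16

# Labeled source domain: synthetic features with one offset per class.
labeled_feats = rng.normal(size=(90, feat_dim)) \
    + np.repeat(np.eye(num_classes, feat_dim) * 5.0, 30, axis=0)
labels = np.repeat(np.arange(num_classes), 30)

# Class prototypes: mean labeled feature per class.
prototypes = np.stack(
    [labeled_feats[labels == c].mean(axis=0) for c in range(num_classes)]
)

# Unlabeled source domain: cluster its features into num_classes groups.
unlabeled_feats = rng.normal(size=(60, feat_dim)) \
    + np.repeat(np.eye(num_classes, feat_dim)[:2] * 5.0, 30, axis=0)
km = KMeans(n_clusters=num_classes, n_init=10, random_state=0).fit(unlabeled_feats)

# Map each cluster to the class whose prototype is nearest its centroid,
# yielding pseudo-labels that can supervise feature re-extraction.
dists = np.linalg.norm(
    km.cluster_centers_[:, None, :] - prototypes[None, :, :], axis=2
)
cluster_to_class = dists.argmin(axis=1)
pseudo_labels = cluster_to_class[km.labels_]
print(pseudo_labels.shape)  # one pseudo-label per unlabeled sample
```

In the actual method these pseudo-labels would drive further training of the feature extractor on the unlabeled domains, which is what filters out the labeled domain's specific bias.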