Training on synthetic data can be beneficial in label- or data-scarce scenarios. However, synthetically trained models often generalize poorly to real domains due to the domain gap. In this work, we make a key observation that the diversity of the learned feature embeddings plays an important role in generalization performance. To this end, we propose contrastive synthetic-to-real generalization (CSG), a novel framework that leverages pre-trained ImageNet knowledge to prevent overfitting to the synthetic domain, while promoting the diversity of feature embeddings as an inductive bias to improve generalization. In addition, we enhance the proposed CSG framework with attentional pooling (A-pool) to let the model focus on semantically important regions and further improve its generalization. We demonstrate the effectiveness of CSG on various synthetic training tasks, exhibiting state-of-the-art performance on zero-shot domain generalization.
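As a rough illustration of the attentional-pooling idea, the sketch below replaces global average pooling with an attention-weighted sum over spatial locations of a feature map. The scoring weight `w` stands in for a learned projection (e.g., a 1x1 convolution); its form here is an assumption for illustration, not the paper's exact parameterization.

```python
import numpy as np

def attentional_pool(feat, w):
    """Attention-weighted pooling of a (C, H, W) feature map.

    feat: (C, H, W) feature map from a backbone network.
    w:    (C,) hypothetical learned scoring vector (stands in for a 1x1 conv).
    Returns a (C,) pooled embedding.
    """
    C, H, W = feat.shape
    scores = np.tensordot(w, feat, axes=1).reshape(-1)   # (H*W,) location scores
    a = np.exp(scores - scores.max())
    a /= a.sum()                                         # softmax over locations
    return (feat.reshape(C, -1) * a).sum(axis=1)         # attention-weighted sum

# With a zero scoring vector the attention is uniform, so A-pool
# reduces to plain global average pooling.
feat = np.arange(24, dtype=float).reshape(2, 3, 4)
pooled = attentional_pool(feat, np.zeros(2))
```

Intuitively, locations with higher scores contribute more to the pooled embedding, letting the contrastive objective emphasize semantically important regions rather than averaging over background.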