The ability to generalize learned representations across significantly different visual domains, such as between real photos, clipart, paintings, and sketches, is a fundamental capacity of the human visual system. In this paper, different from most cross-domain works that utilize some (or full) source domain supervision, we approach a relatively new and very practical Unsupervised Domain Generalization (UDG) setup of having no training supervision in neither source nor target domains. Our approach is based on self-supervised learning of a Bridge Across Domains (BrAD) - an auxiliary bridge domain accompanied by a set of semantics preserving visual (image-to-image) mappings to BrAD from each of the training domains. The BrAD and mappings to it are learned jointly (end-to-end) with a contrastive self-supervised representation model that semantically aligns each of the domains to its BrAD-projection, and hence implicitly drives all the domains (seen or unseen) to semantically align to each other. In this work, we show how using an edge-regularized BrAD our approach achieves significant gains across multiple benchmarks and a range of tasks, including UDG, Few-shot UDA, and unsupervised generalization across multi-domain datasets (including generalization to unseen domains and classes).
翻译:在实际照片、剪贴画、绘画和草图等不同视觉领域中,将学习到的演示内容概括化的能力是人类视觉系统的基本能力。在本文中,与使用某些(或完整)源域监督的多数跨域工程不同,我们采用相对新而非常实用的、不受监督的域通用(UDG) 设置,在源域或目标域中都没有培训监督。我们的方法是基于自我监督地学习横跨域的桥梁(BRAD) -- -- 一个辅助桥梁域,并配有一套从每个培训领域向BRAD保存视觉(图像到图像)映射到BRAD的语义学。BRAD和图谱与大多数(终端到终端)共同学习,采用对比性自我监督的自我监督演示模型,将每个域与其源域或目标域的测得都一致,从而隐含地将所有域(见于或看不见的)与其它域相协调。在这项工作中,我们展示了如何使用边缘的BRAD方法,在多个统域中,包括不甚甚高的常规的域域域域中,在多个一般数据级之间取得重要的成果。