Though convolutional neural networks (CNNs) have demonstrated remarkable ability in learning discriminative features, they often generalize poorly to unseen domains. Domain generalization aims to address this problem by learning, from a set of source domains, a model that is generalizable to any unseen domain. In this paper, a novel approach is proposed based on probabilistically mixing instance-level feature statistics of training samples across source domains. Our method, termed MixStyle, is motivated by the observation that the visual domain is closely related to image style (e.g., photo vs.~sketch images). Such style information is captured by the bottom layers of a CNN, where our proposed style mixing takes place. Mixing the styles of training instances implicitly synthesizes novel domains, which increases the domain diversity of the source domains and hence the generalizability of the trained model. MixStyle fits into mini-batch training perfectly and is extremely easy to implement. The effectiveness of MixStyle is demonstrated on a wide range of tasks including category classification, instance retrieval and reinforcement learning.
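To make the core operation concrete, the sketch below shows, in plain NumPy, what mixing instance-level feature statistics looks like: each feature map in a mini-batch is instance-normalized, and its channel-wise mean and standard deviation are then interpolated with those of a randomly paired instance, with the mixing weight drawn from a Beta distribution. This is an illustrative sketch under our own naming (`mixstyle`, `alpha`, `eps` are assumptions here), not the authors' reference implementation.

```python
import numpy as np

def mixstyle(x, alpha=0.1, eps=1e-6, rng=None):
    """Mix instance-level feature statistics across a mini-batch (sketch).

    x: feature maps of shape (B, C, H, W).
    alpha: Beta-distribution parameter for the mixing weight (assumed name).
    Returns feature maps whose per-channel mean/std are a convex
    combination of each instance's own statistics and those of a
    randomly permuted partner instance.
    """
    rng = rng or np.random.default_rng()
    B = x.shape[0]
    # Channel-wise statistics per instance (the "style").
    mu = x.mean(axis=(2, 3), keepdims=True)                   # (B, C, 1, 1)
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)  # (B, C, 1, 1)
    x_norm = (x - mu) / sigma                                 # style removed
    # Mixing weight per instance, sampled from Beta(alpha, alpha).
    lam = rng.beta(alpha, alpha, size=(B, 1, 1, 1))
    # Pair each instance with a random partner in the batch.
    perm = rng.permutation(B)
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1 - lam) * sigma[perm]
    # Re-apply the mixed style to the normalized content.
    return x_norm * sigma_mix + mu_mix
```

Note one useful sanity check implied by the construction: if all instances in the batch are identical, the mixed statistics coincide with the originals and the operation is an identity.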