Domain shift is ubiquitous in the visual world, and modern deep neural networks commonly suffer severe performance degradation under it due to poor generalization ability, which limits their real-world applications. Domain shift mainly arises from the limited environmental variation in the source data and the large distribution gap between source and unseen target data. To this end, we propose a unified framework, Style-HAllucinated Dual consistEncy learning (SHADE), to handle such domain shift across various visual tasks. Specifically, SHADE is built on two consistency constraints: Style Consistency (SC) and Retrospection Consistency (RC). SC enriches the source situations and encourages the model to learn consistent representations across style-diversified samples. RC leverages general visual knowledge to prevent the model from overfitting to the source data, thus largely keeping the representations consistent between the source model and the general visual model. Furthermore, we present a novel style hallucination module (SHM) to generate the style-diversified samples that are essential for consistency learning. SHM selects basis styles from the source distribution, enabling the model to dynamically generate diverse and realistic samples during training. Extensive experiments demonstrate that our versatile SHADE significantly enhances generalization in various visual recognition tasks, including image classification, semantic segmentation, and object detection, with different models, i.e., ConvNets and Transformers.
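To make the style hallucination idea concrete, the following PyTorch sketch illustrates one plausible reading of SHM: each sample's channel-wise feature statistics (mean and standard deviation) are replaced by a random convex combination of precomputed basis styles, in the spirit of AdaIN-style re-stylization. This is a minimal sketch, not the paper's exact implementation; the names `StyleHallucination`, `basis_mu`, `basis_sigma`, and the use of a Dirichlet distribution for the combination weights are our assumptions, and the selection of basis styles from the source distribution is assumed to happen offline.

```python
import torch
import torch.nn as nn


class StyleHallucination(nn.Module):
    """Hypothetical sketch of a style hallucination module (SHM).

    Strips each sample's own style (channel-wise mean/std) from its
    feature map and re-injects a new style formed as a random convex
    combination of K basis styles selected from the source distribution.
    """

    def __init__(self, basis_mu: torch.Tensor, basis_sigma: torch.Tensor,
                 concentration: float = 0.1, eps: float = 1e-6):
        super().__init__()
        # basis_mu, basis_sigma: (K, C) statistics of K basis styles,
        # assumed to be precomputed from source features.
        self.register_buffer("basis_mu", basis_mu)
        self.register_buffer("basis_sigma", basis_sigma)
        # Dirichlet weights yield a convex combination of basis styles;
        # a small concentration favors diverse, near-one-hot mixtures.
        self.dirichlet = torch.distributions.Dirichlet(
            torch.full((basis_mu.size(0),), concentration))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) intermediate feature maps.
        B = x.size(0)
        mu = x.mean(dim=(2, 3), keepdim=True)                     # (B, C, 1, 1)
        sigma = (x.var(dim=(2, 3), keepdim=True) + self.eps).sqrt()
        normalized = (x - mu) / sigma                             # remove source style
        # Sample per-image combination weights over the K basis styles.
        w = self.dirichlet.sample((B,)).to(x.device)              # (B, K)
        new_mu = (w @ self.basis_mu).view(B, -1, 1, 1)            # (B, C, 1, 1)
        new_sigma = (w @ self.basis_sigma).view(B, -1, 1, 1)
        return normalized * new_sigma + new_mu                    # hallucinated style
```

Under this reading, the module is applied only during training: the original and hallucinated views of each sample would feed the Style Consistency loss, while at test time the module is bypassed and features pass through unchanged.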