Obtaining sufficient labelled data for model training is impractical for most real-life applications. Therefore, we address the problem of domain generalization for semantic segmentation tasks to reduce the need to acquire and label additional data. Recent work on domain generalization increases data diversity by varying domain-variant features such as colour, style and texture in images. However, excessive stylization, or even uniform stylization, may reduce performance. The reduction is especially pronounced for pixels from minority classes, which are already more challenging to classify than pixels from majority classes. Therefore, we introduce a module, $ASH_{+}$, that modulates stylization strength for each pixel depending on the pixel's semantic content. We also introduce a parameter that balances, element-wise and channel-wise, the proportion of stylized features to original source domain features in the stylized source domain images. This learned parameter replaces an empirically determined global hyperparameter, allowing more fine-grained control over the output stylized image. We conduct multiple experiments to validate the effectiveness of the proposed method and evaluate our model on publicly available benchmark semantic segmentation datasets (Cityscapes and SYNTHIA). Quantitative and qualitative comparisons indicate that our approach is competitive with state-of-the-art methods. Code is made available at \url{https://github.com/placeholder}.
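To illustrate the difference between a global mixing hyperparameter and an element-wise, channel-wise weight, the following is a minimal sketch and not the $ASH_{+}$ implementation. It assumes that stylization ends in a convex blend between source features and stylized features; the names `blend_global`, `blend_elementwise` and `weights` are hypothetical, and the weight values are fixed by hand here, whereas in the described method they would be learned.

```python
from typing import List

# A feature map stored as [channel][position], with the spatial grid flattened.
Feature = List[List[float]]


def blend_global(source: Feature, stylized: Feature, alpha: float) -> Feature:
    """Blend with a single scalar alpha shared by every channel and position."""
    return [
        [alpha * t + (1.0 - alpha) * s for s, t in zip(src_c, sty_c)]
        for src_c, sty_c in zip(source, stylized)
    ]


def blend_elementwise(source: Feature, stylized: Feature, weights: Feature) -> Feature:
    """Blend with a separate weight in [0, 1] for each channel and position."""
    return [
        [w * t + (1.0 - w) * s for s, t, w in zip(src_c, sty_c, w_c)]
        for src_c, sty_c, w_c in zip(source, stylized, weights)
    ]


# Toy features: 2 channels, 2 positions each.
source = [[1.0, 2.0], [3.0, 4.0]]
stylized = [[5.0, 6.0], [7.0, 8.0]]

# Hypothetical weights, hand-picked for illustration (assumed learned in practice).
weights = [[0.0, 1.0], [0.5, 0.25]]

global_out = blend_global(source, stylized, 0.5)
elementwise_out = blend_elementwise(source, stylized, weights)
print(global_out)
print(elementwise_out)
```

With a weight of 0 a position keeps its source feature, and with a weight of 1 it is fully stylized, so a weight map can in principle leave some pixels (for example, those of minority classes) less stylized than others, which a single global value cannot do.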