Generalization under domain shifts, which frequently arise in applications such as autonomous driving, remains one of the big open challenges for deep learning models. To address it, we propose an intra-source style augmentation (ISSA) method to improve domain generalization in semantic segmentation. Our method is based on a novel masked noise encoder for StyleGAN2 inversion. The model learns to faithfully reconstruct an image, preserving its semantic layout, through noise prediction. Random masking of the estimated noise enables the style mixing capability of our model, i.e., it allows the global appearance of an image to be altered without affecting its semantic layout. By using the proposed masked noise encoder to randomize style and content combinations in the training set, ISSA effectively increases the diversity of the training data and reduces spurious correlations. As a result, we achieve improvements of up to $12.4\%$ mIoU on driving-scene semantic segmentation under different types of data shifts, i.e., changing geographic locations, adverse weather conditions, and day to night. ISSA is model-agnostic and straightforwardly applicable to both CNNs and Transformers. It is also complementary to other domain generalization techniques, e.g., it improves the recent state-of-the-art solution RobustNet by $3\%$ mIoU in the Cityscapes to Dark Z\"urich setting.
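To make the two ideas named above more concrete, the following is a minimal NumPy sketch of (a) patch-wise random masking of an estimated noise map and (b) recombining style and content from two inverted images. It is an illustration under stated assumptions, not the authors' implementation: the function names `mask_noise` and `mix_style`, the patch size, the masking ratio, the choice to fill masked patches with fresh Gaussian noise, and the convention that the noise map carries the layout while the latent code carries the style are all assumptions made for this example.

```python
import numpy as np


def mask_noise(noise, patch=4, ratio=0.25, rng=None):
    """Randomly mask square patches of a noise map of shape (C, H, W).

    Assumption for this sketch: masked patches are replaced by fresh
    standard Gaussian noise, and one mask is shared across channels.
    Returns the masked noise map and the boolean (H, W) mask.
    """
    rng = np.random.default_rng() if rng is None else rng
    _, height, width = noise.shape
    if height % patch or width % patch:
        raise ValueError("height and width must be divisible by patch")

    # Draw one Bernoulli decision per patch, then upsample to pixel resolution.
    grid = rng.random((height // patch, width // patch)) < ratio
    mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)

    # Keep the estimated noise where the mask is False, resample elsewhere.
    fresh = rng.standard_normal(noise.shape)
    masked = np.where(mask[None, :, :], fresh, noise)
    return masked, mask


def mix_style(content_code, style_code):
    """Combine two inverted images into one generator input.

    Assumption for this sketch: each code is a dict with a 'latent' entry
    (global appearance) and a 'noise' entry (spatial layout). The result
    takes the layout from the content image and the appearance from the
    style image.
    """
    return {"latent": style_code["latent"], "noise": content_code["noise"]}
```

In an augmentation pipeline along these lines, `mix_style` would be applied to pairs of images drawn from the same source training set, and the mixed code would be passed to the generator to synthesize a restyled image that reuses the content image's segmentation labels.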