This paper presents a comprehensive evaluation of instance segmentation models under real-world image corruptions and on out-of-domain image collections, e.g., images captured with a different set-up than the training dataset. The out-of-domain evaluation measures the generalization capability of models, which is essential for real-world applications and an extensively studied topic in domain adaptation. These robustness and generalization evaluations matter both when designing instance segmentation models for real-world applications and when picking an off-the-shelf pretrained model to use directly for the task at hand. Specifically, this benchmark study covers state-of-the-art network architectures, network backbones, normalization layers, models trained from scratch versus from pretrained networks, and the effect of multi-task training on robustness and generalization. Through this study, we gain several insights. For example, we find that group normalization enhances the robustness of networks across corruptions, where the image content stays the same but corruptions are added on top. In contrast, batch normalization improves the generalization of models across different datasets, where the statistics of image features change. We also find that single-stage detectors do not generalize well to image resolutions larger than their training size, whereas multi-stage detectors can easily be used on images of different sizes. We hope that our comprehensive study will motivate the development of more robust and reliable instance segmentation models.