Motivated by the increasing popularity of transformers in computer vision, novel architectures have developed rapidly in recent years. While in-domain performance follows a steady upward trend, properties such as robustness and uncertainty estimation are less explored, leaving doubts about advances in model reliability. Studies along these axes exist, but they are mainly limited to classification models. In contrast, we carry out a study on semantic segmentation, a relevant task for many real-world applications where model reliability is paramount. We analyze a broad variety of models, ranging from older ResNet-based architectures to novel transformers, and assess their reliability based on four metrics: robustness, calibration, misclassification detection, and out-of-distribution (OOD) detection. We find that while recent models are significantly more robust, they are not overall more reliable in terms of uncertainty estimation. We further explore methods that can mitigate this shortcoming and show that improving calibration can also help with other uncertainty metrics such as misclassification or OOD detection. This is the first study on modern segmentation models focused on both robustness and uncertainty estimation, and we hope it will help practitioners and researchers interested in this fundamental vision task. Code is available at https://github.com/naver/relis.
Title: Reliability in Semantic Segmentation: Are We on the Right Track?