Enhancing the robustness of vision algorithms in real-world scenarios is challenging. One reason is that existing robustness benchmarks are limited: they either rely on synthetic data or ignore the effects of individual nuisance factors. We introduce OOD-CV, a benchmark dataset that includes out-of-distribution examples of 10 object categories with respect to pose, shape, texture, context, and weather conditions, and that enables benchmarking models for image classification, object detection, and 3D pose estimation. In addition to this novel dataset, we contribute extensive experiments using popular baseline methods, which reveal that: 1. Some nuisance factors have a much stronger negative effect on performance than others, and the size of the effect also depends on the vision task. 2. Current approaches to enhancing robustness have only marginal effects and can even reduce robustness. 3. We do not observe significant differences between convolutional and transformer architectures. We believe our dataset provides a rich testbed for studying robustness and will help push forward research in this area.