ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

Deep learning vision systems are widely deployed across applications where reliability is critical. However, even today's best models can fail to recognize an object when its pose, lighting, or background varies. While existing benchmarks surface examples that are challenging for models, they do not explain why such mistakes arise. To address this need, we introduce ImageNet-X, a set of sixteen human annotations of factors such as pose, background, or lighting for the entire ImageNet-1k validation set as well as a random subset of 12k training images. Equipped with ImageNet-X, we investigate 2,200 current recognition models and study the types of mistakes as a function of a model's (1) architecture, e.g., transformer vs. convolutional, (2) learning paradigm, e.g., supervised vs. self-supervised, and (3) training procedure, e.g., data augmentation. Regardless of these choices, we find that models have consistent failure modes across ImageNet-X categories. We also find that while data augmentation can improve robustness to certain factors, it induces spill-over effects on other factors. For example, strong random cropping hurts robustness on smaller objects. Together, these insights suggest that, to advance the robustness of modern vision models, future research should focus on collecting additional data and understanding data augmentation schemes. Along with these insights, we release a toolkit based on ImageNet-X to spur further study into the mistakes image recognition systems make.

