Recent research on robustness has revealed a significant performance gap between neural image classifiers evaluated on test data similar to their training set and on naturally shifted distributions, such as sketches, paintings, and animations of the object categories observed during training. Prior work focuses on reducing this gap either by designing engineered augmentations of the training data or by unsupervised pretraining of a single large model on massive in-the-wild datasets scraped from the Internet. However, the notion of a dataset has itself been undergoing a paradigm shift in recent years. With drastic improvements in the quality, ease of use, and accessibility of modern generative models, generated data is pervading the web. In this light, we study the question: How do these generated datasets influence the natural robustness of image classifiers? We find that ImageNet classifiers trained on real data augmented with generated data achieve higher accuracy and effective robustness under natural distribution shifts than those produced by standard training or popular augmentation strategies. We analyze various factors influencing these results, including the choice of conditioning strategies and the amount of generated data. Lastly, we introduce and analyze an evolving generated dataset, ImageNet-G-v1, to better benchmark the design, utility, and critique of standalone generated datasets for robust and trustworthy machine learning. The code and datasets are available at https://github.com/Hritikbansal/generative-robustness.
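For concreteness, the sketch below shows one way to combine real and generated images into a single training set. It is a minimal illustration assuming a PyTorch/torchvision setup, a 1:1 mixing of the two sources, and hypothetical directory paths (`data/imagenet/train`, `data/imagenet_generated/train`); it is not the exact recipe used in the paper.

```python
# Minimal sketch: mixing real ImageNet images with generated images during
# training. Paths and the simple concatenation strategy are illustrative
# assumptions, not the paper's exact training recipe.
import torch
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Real ImageNet training split and a folder of generated images arranged in
# the same class-subdirectory layout (hypothetical paths; identical class
# folder names keep the label indices aligned across the two sources).
real_data = datasets.ImageFolder("data/imagenet/train", transform=transform)
generated_data = datasets.ImageFolder("data/imagenet_generated/train",
                                      transform=transform)

# Concatenate the two sources so shuffled batches draw from both.
combined = ConcatDataset([real_data, generated_data])
loader = DataLoader(combined, batch_size=256, shuffle=True, num_workers=8)

# `loader` can then be plugged into any standard ImageNet training loop.
```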