Distribution shifts are a major source of failure of deployed machine learning models. However, evaluating a model's reliability under distribution shifts can be challenging, especially since it may be difficult to acquire counterfactual examples that exhibit a specified shift. In this work, we introduce dataset interfaces: a framework which allows users to scalably synthesize such counterfactual examples from a given dataset. Specifically, we represent each class from the input dataset as a custom token within the text space of a text-to-image diffusion model. By incorporating these tokens into natural language prompts, we can then generate instantiations of objects in that dataset under desired distribution shifts. We demonstrate how applying our framework to the ImageNet dataset enables us to study model behavior across a diverse array of shifts, including variations in background, lighting, and attributes of the objects themselves. Code available at https://github.com/MadryLab/dataset-interfaces.
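To make the workflow concrete, below is a minimal sketch of how a dataset interface of this kind could be used: a per-class token embedding (e.g., learned via textual inversion) is bound to a custom token in a text-to-image diffusion pipeline and composed with natural-language prompts describing the desired shift. The file path, token name, and prompt are hypothetical placeholders, and the example uses the Hugging Face diffusers API as an illustration rather than the repository's own interface.

```python
# Minimal sketch: generate counterfactual examples for one class under a
# specified distribution shift, assuming a per-class token embedding has
# already been learned (e.g., via textual inversion) and saved to disk.
# The embedding path, token name, and prompt below are illustrative only.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Bind the learned class embedding to a custom token in the text encoder.
pipe.load_textual_inversion(
    "embeddings/golden_retriever.bin", token="<golden-retriever>"
)

# Compose the class token with natural language describing the shift
# (here: a background and lighting change).
prompt = "a photo of a <golden-retriever> in the snow at night"
images = pipe(prompt, num_images_per_prompt=4).images

for i, img in enumerate(images):
    img.save(f"counterfactual_{i}.png")
```

The generated images can then be fed to the model under evaluation to measure how its accuracy on this class degrades under the specified shift.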