For real-world applications of machine learning (ML), it is essential that models make predictions based on well-generalizing features rather than spurious correlations in the data. The identification of such spurious correlations, also known as shortcuts, is a challenging problem that has so far received little attention. In this work, we present a novel approach to detect shortcuts in image and audio datasets by leveraging variational autoencoders (VAEs). The disentanglement of features in the latent space of VAEs allows us to discover correlations in datasets and semi-automatically evaluate whether they constitute ML shortcuts. We demonstrate the applicability of our method on several real-world datasets and identify shortcuts that have not been discovered before. Based on these findings, we also investigate the construction of shortcut adversarial examples.
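To illustrate the idea of discovering correlations in a disentangled latent space, the following is a minimal sketch and not the procedure used in this work. It assumes that the VAE encoder has already produced a matrix of latent means `z` (one row per sample) and that binary labels `y` are available; here both are replaced by synthetic data. The helper `rank_latent_dims` is a hypothetical name, and the absolute Pearson correlation is only one plausible choice of score for flagging latent dimensions as shortcut candidates.

```python
import numpy as np


def rank_latent_dims(z, y):
    """Rank latent dimensions by |Pearson correlation| with the label.

    z: array of shape (n_samples, n_latent), e.g. encoder means (assumed).
    y: array of shape (n_samples,), binary or real-valued labels.
    Returns (order, corr), where order lists dimensions from most to
    least label-correlated.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    zc = z - z.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((zc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant dimensions (zero variance) get a correlation of 0.
    corr = np.divide(zc.T @ yc, denom,
                     out=np.zeros(z.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(corr))
    return order, corr


# Synthetic stand-in for encoder output (assumption: no real VAE here).
rng = np.random.default_rng(0)
n, d = 1000, 8
y = rng.integers(0, 2, size=n)
z = rng.normal(size=(n, d))
z[:, 3] += 2.0 * y  # dimension 3 encodes a label-correlated feature

order, corr = rank_latent_dims(z, y)
print("candidate dimensions, most correlated first:", order[:3])
```

In this toy setup the shifted dimension is ranked first. A high score only marks a candidate: whether the corresponding feature is a shortcut or a legitimate, well-generalizing feature still has to be judged, for instance by inspecting what that latent dimension encodes, which corresponds to the semi-automatic evaluation step mentioned above.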