Despite the success of machine learning applications in science, industry, and society in general, many approaches are known to be non-robust, often relying on spurious correlations to make predictions. Spuriousness occurs when some features correlate with labels but are not causal; relying on such features prevents models from generalizing to unseen environments where those correlations break. In this work, we focus on image classification and propose two data generation processes to reduce spuriousness. Given human annotations of the subset of features responsible for the labels (i.e. causal features, e.g. bounding boxes), we modify this causal set to generate a surrogate image that no longer carries the same label (i.e. a counterfactual image). We also alter non-causal features to generate images that are still recognized as the original labels, which helps to learn a model invariant to those features. On several challenging datasets, our data generation processes outperform state-of-the-art methods in accuracy when spurious correlations break, and increase the saliency focus on causal features, providing better explanations.
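To make the two generation processes concrete, below is a minimal sketch (not the authors' exact pipeline) assuming each training image comes with a human-annotated bounding box over the causal features. The noise-based counterfactual, the brightness-based background perturbation, and all function names are illustrative assumptions; the paper's actual transformations may differ.

```python
import numpy as np

def counterfactual_image(image: np.ndarray, box: tuple) -> np.ndarray:
    """Alter the causal region (inside the box) so the image should no longer
    be recognized as the original label; here we simply replace it with noise,
    whereas a real pipeline might use in-painting or object swapping."""
    x0, y0, x1, y1 = box
    out = image.copy()
    out[y0:y1, x0:x1] = np.random.randint(
        0, 256, out[y0:y1, x0:x1].shape, dtype=out.dtype
    )
    return out

def invariant_image(image: np.ndarray, box: tuple) -> np.ndarray:
    """Alter only the non-causal region (outside the box), keeping the original
    label; here a random brightness change is applied to the background."""
    x0, y0, x1, y1 = box
    out = image.astype(np.float32)
    mask = np.ones(out.shape[:2], dtype=bool)
    mask[y0:y1, x0:x1] = False            # protect the causal region
    out[mask] = np.clip(out[mask] * np.random.uniform(0.6, 1.4), 0, 255)
    return out.astype(image.dtype)

# Usage on a (H, W, 3) uint8 image with a causal bounding box (x0, y0, x1, y1):
img = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
cf = counterfactual_image(img, (16, 16, 48, 48))   # no longer the original label
inv = invariant_image(img, (16, 16, 48, 48))       # keeps the original label
```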