Empirical studies suggest that machine learning models often rely on features, such as the background, that may be spuriously correlated with the label only during training, resulting in poor accuracy at test time. In this work, we identify the fundamental factors that give rise to this behavior by explaining why models fail this way {\em even} in easy-to-learn tasks where one would expect them to succeed. In particular, through a theoretical study of gradient-descent-trained linear classifiers on some easy-to-learn tasks, we uncover two complementary failure modes. These modes arise from how spurious correlations induce two kinds of skews in the data: one geometric in nature and the other statistical in nature. Finally, we construct natural modifications of image classification datasets to understand when these failure modes can arise in practice. We also design experiments to isolate the two failure modes when training modern neural networks on these datasets.