Neural networks trained with empirical risk minimization (ERM) sometimes learn unintended decision rules, in particular when their training data is biased, i.e., when the training labels are strongly correlated with undesirable features. To prevent a network from learning such features, recent methods augment the training data so that examples displaying spurious correlations (i.e., bias-aligned examples) become a minority, whereas the other, bias-conflicting examples become prevalent. However, these approaches are sometimes difficult to train and to scale to real-world data because they rely on generative models or disentangled representations. We propose an alternative based on mixup, a popular augmentation that creates convex combinations of training examples. Our method, coined SelecMix, applies mixup to contradicting pairs of examples, defined as pairs showing either (i) the same label but dissimilar biased features, or (ii) different labels but similar biased features. Identifying such pairs requires comparing examples with respect to unknown biased features. For this, we use an auxiliary contrastive model together with the popular heuristic that biased features are learned preferentially during training. Experiments on standard benchmarks demonstrate the effectiveness of the method, in particular when label noise complicates the identification of bias-conflicting examples.
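To make the pair-selection idea concrete, the following is a minimal PyTorch sketch of one way to implement it; it is our illustration, not the authors' implementation. It assumes the auxiliary biased model exposes embeddings (`bias_emb`) whose cosine similarity serves as a proxy for similarity of the unknown biased features; the names `select_contradicting_pairs` and `selecmix_batch`, and the winner-take-all scoring rule, are hypothetical simplifications.

```python
# Minimal, self-contained sketch of contradicting-pair mixup (assumptions noted above).
import torch
import torch.nn.functional as F

def select_contradicting_pairs(labels, bias_emb):
    """For each anchor i, pick the partner j that most contradicts the bias:
    (i) same label as i but dissimilar bias embedding, or
    (ii) different label but similar bias embedding.
    `bias_emb` are embeddings from the auxiliary bias-capturing model (assumed)."""
    # Pairwise cosine similarity as a proxy for biased-feature similarity.
    sim = F.cosine_similarity(bias_emb.unsqueeze(1), bias_emb.unsqueeze(0), dim=-1)
    same_label = labels.unsqueeze(1).eq(labels.unsqueeze(0))
    n = labels.size(0)
    partner = torch.empty(n, dtype=torch.long)
    for i in range(n):
        # Same label -> reward dissimilar bias; different label -> reward similar bias.
        score = torch.where(same_label[i], -sim[i], sim[i])
        score[i] = float("-inf")  # exclude the anchor itself
        partner[i] = score.argmax()
    return partner

def selecmix_batch(x, y, bias_emb, alpha=1.0):
    """Mixup restricted to contradicting pairs: convex-combine each example
    with its selected partner; labels are mixed in the loss as in mixup."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    j = select_contradicting_pairs(y, bias_emb)
    x_mix = lam * x + (1.0 - lam) * x[j]
    return x_mix, y, y[j], lam
```

A training step would then compute logits on `x_mix` and combine the two losses as in standard mixup, e.g. `lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)`.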