Empirical studies suggest that machine learning models trained with empirical risk minimization (ERM) often rely on attributes that may be spuriously correlated with the class labels. Such models typically perform poorly at inference time on data lacking these correlations. In this work, we explicitly consider a situation where potential spurious correlations are present in the majority of the training data. In contrast with existing approaches, which use the ERM model outputs to detect samples without spurious correlations and then heuristically upweight or upsample those samples, we propose the logit correction (LC) loss, a simple yet effective improvement on the softmax cross-entropy loss, to correct the sample logits. We demonstrate that minimizing the LC loss is equivalent to maximizing group-balanced accuracy, so the proposed LC can mitigate the negative impact of spurious correlations. Our extensive experimental results further reveal that the proposed LC loss outperforms state-of-the-art (SoTA) solutions on multiple popular benchmarks by a large margin (an average 5.5% absolute improvement), without access to spurious attribute labels. LC is also competitive with oracle methods that make use of the attribute labels. Code is available at https://github.com/shengliu66/LC.
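The abstract does not spell out the loss itself. As a rough illustration of the general logit-correction idea, the sketch below shifts each class logit by the log of an estimated prior before applying the standard softmax cross-entropy, in the spirit of logit adjustment; the function name, the `tau` scaling parameter, and the per-class priors are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def lc_loss_sketch(logits, labels, class_priors, tau=1.0):
    """Hedged sketch of a logit-corrected cross-entropy.

    Before the softmax, each class logit is shifted by tau * log(prior),
    so classes that are over-represented (e.g. via a spurious correlation)
    receive a handicap at training time. `class_priors` is a hypothetical
    per-class prior estimate; the actual LC loss may estimate and apply
    correction terms differently.
    """
    adjusted = logits + tau * np.log(class_priors + 1e-12)
    # Numerically stable softmax cross-entropy on the corrected logits.
    shifted = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With uniform priors the correction shifts every logit equally, so the loss reduces to the plain softmax cross-entropy; only a skewed prior changes the gradient signal.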