We study transfer learning in the presence of spurious correlations. We experimentally demonstrate that directly transferring the stable feature extractor learned on the source task may not eliminate these biases on the target task. However, we hypothesize that the unstable features of the source task and those of the target task are directly related. By explicitly informing the target classifier of the source task's unstable features, we can regularize the biases in the target task. Specifically, we derive a representation that encodes the unstable features by contrasting different data environments in the source task. On the target task, we cluster the data according to this representation and achieve robustness by minimizing the worst-case risk across all clusters. We evaluate our method on both text and image classification tasks. Empirical results demonstrate that our algorithm maintains robustness on the target task, outperforming the best baseline by 22.9% in absolute accuracy across 12 transfer settings. Our code is available at https://github.com/YujiaBao/Tofu.
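The final step of the pipeline, minimizing the worst-case risk across the discovered clusters, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the target data has already been partitioned into clusters via the unstable-feature representation, and the names `model`, `clusters`, `num_steps`, and `lr` are illustrative.

```python
# Minimal sketch of worst-case risk minimization over clusters,
# assuming a PyTorch classifier and pre-clustered target data.
import torch
import torch.nn as nn


def train_worst_case(model: nn.Module,
                     clusters: list[tuple[torch.Tensor, torch.Tensor]],
                     num_steps: int = 1000,
                     lr: float = 1e-3) -> None:
    """Each element of `clusters` is an (inputs, labels) pair for one cluster."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(num_steps):
        # Empirical risk on every cluster.
        losses = torch.stack([criterion(model(x), y) for x, y in clusters])
        # Optimize the worst (maximum-loss) cluster at each step, so the
        # classifier cannot rely on a shortcut that helps only some clusters.
        worst = losses.max()
        optimizer.zero_grad()
        worst.backward()
        optimizer.step()
```

Taking a hard max over cluster losses is the simplest instantiation of this objective; group-DRO-style implementations often replace it with exponentially updated per-group weights for smoother optimization.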