While unbiased machine learning models are essential for many applications, bias is a human-defined concept that can vary across tasks. Given only input-label pairs, algorithms may lack sufficient information to distinguish stable (causal) features from unstable (spurious) features. However, related tasks often share similar biases -- an observation we can leverage to develop stable classifiers in the transfer setting. In this work, we explicitly inform the target classifier about unstable features in the source tasks. Specifically, we derive a representation that encodes the unstable features by contrasting different data environments in the source task. We achieve robustness by clustering data of the target task according to this representation and minimizing the worst-case risk across these clusters. We evaluate our method on both text and image classification tasks. Empirical results demonstrate that our algorithm maintains robustness on the target task for both synthetically generated and real-world environments.
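To make the three-step pipeline concrete, here is a minimal sketch of the approach described above, under several illustrative assumptions: an environment classifier's hidden layer stands in for the unstable-feature representation, synthetic random tensors stand in for source/target data, KMeans performs the clustering, and the worst-case risk is minimized group-DRO-style by taking the max of per-cluster losses. None of the model shapes or hyperparameters are from the paper.

```python
# Hypothetical sketch -- all data, shapes, and the environment-classifier
# proxy for the unstable representation are illustrative assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

torch.manual_seed(0)
D, H, K = 32, 16, 2  # input dim, feature dim, number of source environments

# Step 1: contrast source environments. Train a small network to predict
# which environment each source example came from; its hidden layer is
# pushed to encode environment-specific (unstable) features, since stable
# features are by definition shared across environments.
featurizer = nn.Sequential(nn.Linear(D, H), nn.ReLU())
env_head = nn.Linear(H, K)
opt = torch.optim.Adam([*featurizer.parameters(), *env_head.parameters()], lr=1e-2)

x_src = torch.randn(512, D)          # placeholder source inputs
e_src = torch.randint(0, K, (512,))  # environment labels
for _ in range(200):
    loss = nn.functional.cross_entropy(env_head(featurizer(x_src)), e_src)
    opt.zero_grad(); loss.backward(); opt.step()

# Step 2: cluster target data according to the unstable representation,
# so each cluster roughly groups examples sharing the same spurious pattern.
x_tgt = torch.randn(256, D)          # placeholder target inputs
y_tgt = torch.randint(0, 2, (256,))  # target labels
with torch.no_grad():
    z_tgt = featurizer(x_tgt).numpy()
clusters = torch.tensor(KMeans(n_clusters=K, n_init=10).fit_predict(z_tgt))

# Step 3: minimize the worst-case risk across clusters, so the target
# classifier cannot win by exploiting any single cluster's spurious feature.
clf = nn.Linear(D, 2)
clf_opt = torch.optim.Adam(clf.parameters(), lr=1e-2)
for _ in range(200):
    per_cluster = torch.stack([
        nn.functional.cross_entropy(clf(x_tgt[clusters == c]), y_tgt[clusters == c])
        for c in range(K)
    ])
    worst = per_cluster.max()        # group-DRO-style objective
    clf_opt.zero_grad(); worst.backward(); clf_opt.step()
```

Taking the max over per-cluster losses is one common way to optimize a worst-case objective; a softmax-weighted or exponentiated-gradient reweighting over clusters would serve the same purpose with smoother updates.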