Stance detection models tend to rely on dataset bias in the text as a shortcut and thus fail to sufficiently learn the interaction between targets and texts. Recent debiasing methods usually treat features learned by small models, or by large models at early training steps, as bias features, and exclude the branch that learns those features during inference. However, most of these methods fail to disentangle the ``good'' stance features from the ``bad'' bias features in the text. In this paper, we investigate how to mitigate dataset bias in stance detection. Motivated by causal effects, we leverage a novel counterfactual inference framework, which enables us to capture the dataset bias in the text as the direct causal effect of the text on stances, and to reduce this bias by subtracting the direct text effect from the total causal effect. We model bias features as features that correlate with the stance labels but fail on intermediate stance-reasoning subtasks, and propose an adversarial bias learning module to model the bias more accurately. To verify whether our model better captures the interaction between texts and targets, we evaluate it on recently proposed test sets that probe understanding of the task from various aspects. Experiments demonstrate that our proposed method (1) better models the bias features, and (2) outperforms existing debiasing baselines on both the original dataset and most of the newly constructed test sets.
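The debiasing step described above, subtracting the direct text effect from the total causal effect, can be illustrated at the logit level. The sketch below is a minimal illustration, not the paper's implementation: the function name `counterfactual_debias`, the logit-level fusion, and the scaling factor `alpha` are all assumptions for demonstration.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def counterfactual_debias(total_logits, text_only_logits, alpha=1.0):
    """Approximate the debiased stance score by subtracting the direct
    text effect (text-only branch) from the total causal effect
    (full text+target model). `alpha` is a hypothetical scaling factor
    controlling how much of the text-only bias is removed."""
    return total_logits - alpha * text_only_logits

# Toy example with three stance classes (favor / against / none).
total = np.array([2.0, 1.0, 0.5])      # logits from the full text+target model
text_only = np.array([1.5, 0.2, 0.1])  # logits from the text-only (bias) branch

debiased = counterfactual_debias(total, text_only)
biased_pred = int(np.argmax(total))      # prediction driven partly by text bias
debiased_pred = int(np.argmax(softmax(debiased)))
```

In this toy setting the text-only branch strongly favors class 0, so removing its contribution flips the prediction, which is the intended effect of excluding the bias branch at inference time.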