Natural language understanding (NLU) models often rely on dataset biases rather than intended task-relevant features to achieve high performance on specific datasets. As a result, these models perform poorly on datasets outside the training distribution. Some recent studies address this issue by reducing the weights of biased samples during training. However, these methods still encode biased latent features in their representations and neglect the dynamic nature of bias, which hinders model prediction. We propose an NLU debiasing method, named debiasing contrastive learning (DCT), to simultaneously alleviate the above problems based on contrastive learning. We devise a debiasing positive sampling strategy to mitigate biased latent features by selecting the least similar biased positive samples. We also propose a dynamic negative sampling strategy to capture the dynamic influence of biases by employing a bias-only model to dynamically select the most similar biased negative samples. We conduct experiments on three NLU benchmark datasets. Experimental results show that DCT outperforms state-of-the-art baselines on out-of-distribution datasets while maintaining in-distribution performance. We also verify that DCT can reduce biased latent features from the model's representation.
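To make the two sampling strategies concrete, below is a minimal PyTorch sketch of how they could be realized within a training batch. All names (`dct_contrastive_loss`, `main_feats`, `bias_feats`, `temperature`) are illustrative assumptions based on the description above, not the authors' released implementation; the sketch also assumes each label appears at least twice per batch.

```python
# Illustrative sketch only: positive/negative selection guided by a bias-only model,
# followed by an InfoNCE-style contrastive loss on the main model's representations.
import torch
import torch.nn.functional as F

def dct_contrastive_loss(main_feats, bias_feats, labels, temperature=0.1):
    """For each anchor, take the same-label sample that is LEAST similar under the
    bias-only model as the positive, and the different-label sample that is MOST
    similar under the bias-only model as the hardest negative."""
    main_feats = F.normalize(main_feats, dim=-1)   # (B, d) main-model embeddings
    bias_feats = F.normalize(bias_feats, dim=-1)   # (B, d_b) bias-only embeddings
    bias_sim = bias_feats @ bias_feats.T           # pairwise similarity in the bias space
    same_label = labels.unsqueeze(0) == labels.unsqueeze(1)
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    # Debiasing positive sampling: least bias-similar sample with the same label.
    pos_scores = bias_sim.masked_fill(~same_label | self_mask, float("inf"))
    pos_idx = pos_scores.argmin(dim=1)

    # Dynamic negative sampling: most bias-similar sample with a different label.
    neg_scores = bias_sim.masked_fill(same_label, float("-inf"))
    neg_idx = neg_scores.argmax(dim=1)

    anchor, pos, neg = main_feats, main_feats[pos_idx], main_feats[neg_idx]
    logits = torch.stack([(anchor * pos).sum(-1),
                          (anchor * neg).sum(-1)], dim=1) / temperature
    target = torch.zeros(len(labels), dtype=torch.long, device=labels.device)
    return F.cross_entropy(logits, target)  # positive sits at index 0
```

In practice such a term would be combined with the standard task loss, and the bias-only model would be trained on bias-prone features so that its similarities track spurious cues rather than task semantics.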