Natural language understanding (NLU) models tend to rely on spurious correlations (i.e., dataset bias), which yields high performance on in-distribution datasets but poor performance on out-of-distribution ones. Most existing debiasing methods identify and down-weight samples with biased features (i.e., superficial surface features that cause such spurious correlations). However, down-weighting these samples hinders the model from learning from their unbiased parts. To tackle this challenge, in this paper we propose to eliminate spurious correlations in a fine-grained manner from a feature-space perspective. Specifically, we introduce Random Fourier Features and weighted re-sampling to decorrelate dependencies between features and thereby mitigate spurious correlations. After obtaining decorrelated features, we further design a mutual-information-based method to purify them, which forces the model to learn features that are more relevant to the task. Extensive experiments on two well-studied NLU tasks demonstrate that our method is superior to other comparative approaches.
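To make the decorrelation step concrete, below is a minimal PyTorch sketch of the general idea: features are mapped with Random Fourier Features, and per-sample weights are learned by minimizing the weighted cross-covariance between feature dimensions. All names, dimensions, and hyperparameters here (random_fourier_features, decorrelation_loss, NUM_RFF, SIGMA, the learning rate) are illustrative assumptions, not the authors' exact implementation.

```python
import math
import torch

# Fix one set of Random Fourier Feature parameters so the loss is stable
# across optimization steps (all dimensions here are illustrative).
FEAT_DIM, NUM_RFF, SIGMA = 768, 64, 1.0
rff_w = torch.randn(FEAT_DIM, NUM_RFF) / SIGMA
rff_b = 2 * math.pi * torch.rand(NUM_RFF)

def random_fourier_features(x):
    """Map features x of shape (batch, FEAT_DIM) into RFF space."""
    return math.sqrt(2.0 / NUM_RFF) * torch.cos(x @ rff_w + rff_b)

def decorrelation_loss(features, weight_logits):
    """Weighted cross-covariance between pairs of RFF-mapped feature
    dimensions; minimizing it over the sample weights approximates a
    re-sampling that decorrelates the features."""
    phi = random_fourier_features(features)               # (batch, NUM_RFF)
    w = torch.softmax(weight_logits, dim=0).unsqueeze(1)  # normalized sample weights
    mean = (w * phi).sum(dim=0, keepdim=True)             # weighted mean per dimension
    centered = phi - mean
    cov = (w * centered).t() @ centered                   # weighted covariance matrix
    off_diag = cov - torch.diag(torch.diag(cov))          # keep only cross terms
    return (off_diag ** 2).sum()

# Learn per-sample weights by gradient descent on the decorrelation loss.
feats = torch.randn(128, FEAT_DIM)                        # e.g. pooled encoder features
weight_logits = torch.zeros(128, requires_grad=True)
optimizer = torch.optim.Adam([weight_logits], lr=0.01)
for _ in range(200):
    optimizer.zero_grad()
    loss = decorrelation_loss(feats, weight_logits)
    loss.backward()
    optimizer.step()
```

In practice, such learned weights would then re-weight each sample's training loss, so that the downstream classifier is trained on an effectively decorrelated distribution.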