Paraphrase Identification is a fundamental task in Natural Language Processing. While much progress has been made in the field, the performance of many state-of-the-art models often suffers from distribution shift at inference time. We verify that a major source of this performance drop comes from biases introduced by negative examples. To overcome these biases, we propose training two separate models, one that uses only the positive pairs and one that uses only the negative pairs. This gives us the option of deciding how much to rely on the negative model, for which we introduce a perplexity-based out-of-distribution metric that, as we show, can effectively and automatically determine how much weight the negative model should receive during inference. We support our findings with strong empirical results.
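To make the idea concrete, the following is a minimal sketch, not the paper's actual formulation, of how a perplexity-based out-of-distribution score could be turned into a weight for the negative-pair model at inference time. The function names `ood_weight` and `combined_score`, the standardized-perplexity heuristic, and all numeric values are illustrative assumptions, not details taken from the abstract.

```python
import math

def ood_weight(perplexity, train_ppl_mean, train_ppl_std, scale=1.0):
    """Map a language-model perplexity to a weight in [0, 1].

    Inputs whose perplexity lies far from the training-set distribution
    (i.e. more out-of-distribution) receive a smaller weight, so the
    negative-pair model contributes less to the final prediction.
    """
    # Standardized distance of this input's perplexity from the training mean.
    z = abs(perplexity - train_ppl_mean) / max(train_ppl_std, 1e-8)
    # Squash with a sigmoid so in-distribution inputs get a weight near 1.
    return 1.0 / (1.0 + math.exp(scale * (z - 1.0)))

def combined_score(pos_score, neg_score, weight):
    """Blend the positive-pair and negative-pair model scores.

    `pos_score` and `neg_score` are each model's probability that the
    sentence pair is a paraphrase; `weight` down-weights the negative
    model when the input looks out-of-distribution.
    """
    return (pos_score + weight * neg_score) / (1.0 + weight)

# Example usage with made-up numbers.
w = ood_weight(perplexity=85.0, train_ppl_mean=40.0, train_ppl_std=15.0)
print(combined_score(pos_score=0.72, neg_score=0.31, weight=w))
```

The key design choice sketched here is that the weight is computed per input at inference time, so heavily out-of-distribution examples automatically fall back toward the positive-pair model alone.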