Neural encoders are frequently used in NLP to perform dense retrieval tasks, for instance, to generate the candidate documents for a given query in question-answering tasks. However, sparse annotation and label noise in the training data make it challenging to train or fine-tune such retrieval models. Although existing works have attempted to mitigate these problems by incorporating modified loss functions or performing data cleaning, these approaches either introduce additional hyperparameters to tune during training or add substantial complexity to the training setup. In this work, we consider a label weakening approach to obtain retrieval models that are robust to label noise. Instead of enforcing a single, potentially erroneous label for each query-document pair, we allow a set of plausible labels derived from both the observed supervision and the model's confidence scores. We perform an extensive evaluation of two retrieval models and one re-ranking model on four diverse ranking datasets. To simulate a realistic noisy setting, we use a semantic-aware noise generation technique to inject different ratios of label noise. Our initial results show that label weakening improves retrieval performance in comparison to 10 different state-of-the-art loss functions.
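The abstract does not spell out the exact weakening rule, so the following is only a minimal sketch of one natural instantiation: the weakened target is a convex combination of the observed (possibly noisy) binary relevance label and the model's own confidence score. The function names `weakened_targets` and `weakened_bce_loss` and the mixing weight `alpha` are illustrative assumptions, not names from the paper.

```python
import torch
import torch.nn.functional as F

def weakened_targets(observed: torch.Tensor,
                     logits: torch.Tensor,
                     alpha: float = 0.5) -> torch.Tensor:
    # Blend the observed hard label with the model's confidence so that a
    # pair whose label disagrees with a confident prediction receives a
    # softened ("weakened") target instead of a single enforced label.
    # `alpha` controls how much the observed supervision is trusted.
    confidence = torch.sigmoid(logits)  # model's estimate of P(relevant)
    return alpha * observed + (1.0 - alpha) * confidence

def weakened_bce_loss(observed: torch.Tensor,
                      logits: torch.Tensor,
                      alpha: float = 0.5) -> torch.Tensor:
    # Detach the target branch so the model is supervised by a fixed
    # weakened label rather than chasing its own moving predictions.
    targets = weakened_targets(observed, logits.detach(), alpha)
    return F.binary_cross_entropy_with_logits(logits, targets)
```

Under this reading, a training step would simply swap the usual binary cross-entropy for the weakened variant, e.g. `loss = weakened_bce_loss(labels.float(), model(query, doc), alpha=0.7)`; unlike loss-correction methods, this adds no extra terms to the objective, only soft targets.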
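The abstract likewise leaves the semantic-aware noise generator unspecified. The sketch below illustrates the general idea under the assumption that "semantic-aware" means corrupting the labels of the most confusable query-document pairs, as measured by embedding cosine similarity, rather than flipping labels uniformly at random; `inject_semantic_noise` and `noise_ratio` are hypothetical names.

```python
import torch
import torch.nn.functional as F

def inject_semantic_noise(labels: torch.Tensor,
                          query_emb: torch.Tensor,
                          doc_emb: torch.Tensor,
                          noise_ratio: float = 0.2) -> torch.Tensor:
    # Cosine similarity between each query and its paired document
    # (one pair per row of the embedding matrices).
    sim = F.cosine_similarity(query_emb, doc_emb, dim=-1)
    # Confusability: high-similarity negatives and low-similarity positives
    # are the annotation mistakes a human would most plausibly make.
    confusability = torch.where(labels > 0.5, -sim, sim)
    n_flip = int(noise_ratio * labels.numel())
    _, idx = torch.topk(confusability, n_flip)
    noisy = labels.clone()
    noisy[idx] = 1.0 - noisy[idx]  # flip the selected labels
    return noisy
```

Compared with uniform random flips, this kind of corruption is harder to detect from the loss alone, which makes it a more realistic stress test for noise-robust training.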