Recent work on relation extraction (RE) has shown encouraging improvements by conducting contrastive learning on silver labels generated by distant supervision before fine-tuning on gold labels. Existing methods typically assume all these silver labels are accurate and therefore treat them equally in contrastive learning; however, distant supervision is inevitably noisy: some silver labels are more reliable than others. In this paper, we first assess the quality of silver labels via a simple and automatic approach we call "learning order denoising," where we train a language model to learn these relations and record the order in which training instances are learned. We show that learning order largely corresponds to label accuracy: on average, silver labels learned early are more accurate than those learned later. We then propose a novel fine-grained contrastive learning (FineCL) framework for RE, which leverages this additional, fine-grained information about which silver labels are and are not noisy to improve the quality of learned relationship representations for RE. Experiments on multiple RE benchmarks show consistent, significant performance gains of FineCL over state-of-the-art methods.
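The learning-order idea above can be illustrated with a toy sketch. This is not the paper's implementation: a small logistic-regression learner on synthetic data stands in for the language model, the 20% label-flip rate simulates distant-supervision noise, and all variable names are illustrative. The key mechanism is simply recording the first epoch at which the model's prediction matches each instance's silver label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary "relation" data: 200 instances, 10 features.
X = rng.normal(size=(200, 10))
true_w = rng.normal(size=10)
y_gold = (X @ true_w > 0).astype(float)

# Flip 20% of labels to simulate noisy distant supervision.
flipped = rng.random(200) < 0.2
y_silver = np.where(flipped, 1 - y_gold, y_gold)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(10)
# Epoch at which each instance is first predicted correctly (-1 = never).
first_learned = np.full(200, -1)

for epoch in range(50):
    # One gradient step of logistic regression on the silver labels.
    p = sigmoid(X @ w)
    w -= 0.1 * X.T @ (p - y_silver) / len(y_silver)
    # Record instances whose silver label is matched for the first time.
    p = sigmoid(X @ w)
    correct = (p >= 0.5) == (y_silver >= 0.5)
    newly = correct & (first_learned < 0)
    first_learned[newly] = epoch

# first_learned now gives a learning order over training instances;
# the paper's observation is that, on average, clean labels appear
# earlier in this order than flipped ones.
learned = first_learned >= 0
print("instances learned:", int(learned.sum()), "of", len(y_silver))
```

In the full method, the recorded order is used not as a hard filter but as fine-grained reliability information weighting the contrastive objective.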