Relation extraction (RE), which relies on structurally annotated corpora for model training, has been particularly challenging in low-resource scenarios and domains. Recent literature has tackled low-resource RE by self-supervised learning, where the solution is to pretrain relation embeddings with an RE-based objective and then finetune on labeled data with a classification-based objective. However, a critical challenge to this approach is the gap between objectives, which prevents the RE model from fully utilizing the knowledge in the pretrained representations. In this paper, we aim to bridge this gap and propose to pretrain and finetune the RE model using consistent objectives of contrastive learning. Since in this representation learning paradigm one relation may easily form multiple clusters in the representation space, we further propose a multi-center contrastive loss that allows one relation to form multiple clusters, better aligning finetuning with pretraining. Experiments on two document-level RE datasets, BioRED and Re-DocRED, demonstrate the effectiveness of our method. In particular, when using only 1% of the end-task training data, our method outperforms a PLM-based RE classifier by 10.5% and 5.8% on the two datasets, respectively.
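To make the idea of a multi-center contrastive loss concrete, the following is a minimal sketch (not the authors' implementation) of one plausible formulation: each relation label owns K learnable centers, and an example is pulled toward its most similar own-label center while being pushed away from all other centers, so a single relation is free to occupy several clusters. All names and hyperparameters (num_relations, num_centers, temperature) are illustrative assumptions.

```python
# Hypothetical sketch of a multi-center contrastive loss; not the paper's released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiCenterContrastiveLoss(nn.Module):
    def __init__(self, num_relations: int, num_centers: int, dim: int, temperature: float = 0.1):
        super().__init__()
        # K learnable centers per relation, stored as one (R * K, dim) matrix.
        self.centers = nn.Parameter(torch.randn(num_relations * num_centers, dim))
        self.num_relations = num_relations
        self.num_centers = num_centers
        self.temperature = temperature

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Temperature-scaled cosine similarity between each relation embedding and every center.
        z = F.normalize(embeddings, dim=-1)          # (batch, dim)
        c = F.normalize(self.centers, dim=-1)        # (R * K, dim)
        logits = z @ c.t() / self.temperature        # (batch, R * K)

        # Mask selecting the K centers that belong to each example's gold relation.
        center_labels = torch.arange(self.num_relations, device=labels.device)
        center_labels = center_labels.repeat_interleave(self.num_centers)   # (R * K,)
        pos_mask = center_labels.unsqueeze(0) == labels.unsqueeze(1)        # (batch, R * K)

        # Use only the most similar own-label center as the positive, so one relation
        # can spread over multiple clusters instead of being forced into a single one.
        pos_logit = logits.masked_fill(~pos_mask, float("-inf")).max(dim=1).values
        loss = -pos_logit + torch.logsumexp(logits, dim=1)
        return loss.mean()
```

Under this formulation, the same contrastive objective can be applied during both pretraining and finetuning, which is the consistency the abstract argues for; the nearest-center selection is one simple way to realize the multi-cluster behavior, not necessarily the exact design used in the paper.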