Distant supervision for relation extraction provides uniform bag labels for each sentence inside the bag, while accurate sentence labels are important for downstream applications that need the exact relation type. Directly using bag labels for sentence-level training introduces much noise, thus severely degrading performance. In this work, we propose the use of negative training (NT), in which a model is trained using complementary labels that state "the instance does not belong to these complementary labels". Since the probability of selecting a true label as a complementary label is low, NT provides less noisy information. Furthermore, the model trained with NT is able to separate the noisy data from the training data. Based on NT, we propose a sentence-level framework, SENT, for distant relation extraction. SENT not only filters the noisy data to construct a cleaner dataset, but also performs a re-labeling process to transform the noisy data into useful training data, thus further benefiting the model's performance. Experimental results show the significant improvement of the proposed method over previous methods in both sentence-level evaluation and de-noising effect.
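To make the complementary-label idea concrete, the following is a minimal sketch of a negative training loss in PyTorch. It is an illustration under assumptions, not the paper's exact implementation: the function name, the uniform sampling of one complementary label per instance, and the loss form `-log(1 - p(complementary label))` are chosen here to reflect the general NT principle of pushing the model away from a label the instance presumably does not have.

```python
import torch
import torch.nn.functional as F

def negative_training_loss(logits, observed_labels, num_classes):
    """Illustrative negative training (NT) loss.

    For each instance, sample a complementary label different from the
    (possibly noisy) observed label and minimize -log(1 - p(complementary)),
    i.e. train the model that the instance does NOT belong to that label.
    """
    probs = F.softmax(logits, dim=-1)  # (batch, num_classes)

    # Uniformly sample one complementary label per instance, guaranteed
    # to differ from the observed label.
    offsets = torch.randint(1, num_classes, observed_labels.shape,
                            device=logits.device)
    comp_labels = (observed_labels + offsets) % num_classes

    # Probability the model assigns to the complementary label.
    p_comp = probs.gather(1, comp_labels.unsqueeze(1)).squeeze(1)

    # Penalize confidence in the complementary label.
    return -torch.log(1.0 - p_comp + 1e-12).mean()
```

Because a complementary label is rarely the instance's true label, this objective gives a weaker but much less noisy training signal than directly fitting the bag label, which is the intuition behind using NT for noise separation.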