To reduce human annotation effort for relation extraction (RE) tasks, distantly supervised approaches have been proposed, yet they still struggle with low performance. In this work, we propose a novel DSRE-NLI framework, which considers both distant supervision from existing knowledge bases and indirect supervision from pretrained language models trained for other tasks. DSRE-NLI energizes an off-the-shelf natural language inference (NLI) engine with a semi-automatic relation verbalization (SARV) mechanism to provide indirect supervision, and further consolidates the distant annotations to benefit multi-class RE models. The NLI-based indirect supervision requires only one relation verbalization template from humans as a semantically general template for each relation; the template set is then enriched with high-quality textual patterns automatically mined from the distantly annotated corpus. With two simple and effective data consolidation strategies, the quality of the training data is substantially improved. Extensive experiments demonstrate that the proposed framework significantly improves on SOTA performance (by up to 7.73\% F1) on distantly supervised RE benchmark datasets.
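To make the NLI-based verification step concrete, the sketch below shows how an off-the-shelf NLI engine could score relation verbalization templates against a distantly annotated sentence and filter out annotations that no template entails. The model checkpoint, template strings, threshold, and helper names here are illustrative assumptions, not the exact DSRE-NLI implementation.

```python
# Minimal sketch: NLI-based verification of distant annotations,
# assuming an MNLI-finetuned checkpoint (roberta-large-mnli) and a
# toy template set. Not the authors' exact implementation.
from transformers import pipeline

# Off-the-shelf NLI engine; any MNLI-finetuned checkpoint works.
nli = pipeline("text-classification", model="roberta-large-mnli")

# One human-written, semantically general template per relation,
# enriched with automatically mined textual patterns (toy examples).
TEMPLATES = {
    "founded_by": ["{obj} founded {subj}.", "{subj} was founded by {obj}."],
}

def entailment_score(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    scores = nli({"text": premise, "text_pair": hypothesis}, top_k=None)
    return next(s["score"] for s in scores if s["label"] == "ENTAILMENT")

def verify(sentence: str, subj: str, obj: str, relation: str,
           thr: float = 0.5) -> bool:
    """Keep a distant annotation only if some template is entailed."""
    return any(
        entailment_score(sentence, t.format(subj=subj, obj=obj)) >= thr
        for t in TEMPLATES.get(relation, ())
    )

# A knowledge-base triple (Apple, founded_by, Steve Jobs) matched to a
# sentence: the annotation is kept only if the NLI engine agrees.
print(verify("Steve Jobs co-founded Apple in 1976.",
             "Apple", "Steve Jobs", "founded_by"))
```

In this sketch, sentences whose distant label is never entailed by any template would be re-labeled or dropped, which corresponds to the data consolidation idea described above; the entailment threshold is a hypothetical hyperparameter.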