We propose a distantly supervised relation extraction approach for long-tailed, imbalanced data, which is prevalent in real-world settings. Here, the challenge is to learn accurate "few-shot" models for classes at the tail of the class distribution, for which little data is available. Inspired by the rich semantic correlations between classes at the long tail and those at the head, we take advantage of the knowledge from data-rich classes at the head of the distribution to boost the performance of the data-poor classes at the tail. First, we propose to leverage implicit relational knowledge among class labels from knowledge graph embeddings and learn explicit relational knowledge using graph convolution networks. Second, we integrate that relational knowledge into the relation extraction model through a coarse-to-fine knowledge-aware attention mechanism. Results on a large-scale benchmark dataset show that our approach significantly outperforms other baselines, especially for long-tail relations.
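The two steps named in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy four-node label hierarchy, random vectors standing in for pretrained knowledge graph embeddings and sentence encodings, a single graph convolution layer, and dot-product attention. All names (`gcn_layer`, `attend`, `coarse_to_fine`, the node indices) are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(edges, n):
    """Build D^-1/2 (A + I) D^-1/2 for an undirected graph with n nodes."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_hat, x, w):
    """One graph convolution: ReLU(A_hat X W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(a_hat, x), w)]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(sentences, query):
    """Attention-pool a bag of sentence vectors against one class query."""
    scores = [sum(s_k * q_k for s_k, q_k in zip(s, query)) for s in sentences]
    weights = softmax(scores)
    dim = len(sentences[0])
    pooled = [sum(w * s[k] for w, s in zip(weights, sentences)) for k in range(dim)]
    return pooled, weights

def coarse_to_fine(sentences, class_emb, path):
    """Concatenate pooled bag vectors along a coarse-to-fine label path."""
    bag = []
    for node in path:
        pooled, _ = attend(sentences, class_emb[node])
        bag.extend(pooled)
    return bag

# Toy hierarchy (assumed): 0 = coarse class, 1 = mid-level class,
# 2 = data-rich head relation, 3 = data-poor tail relation.
rng = random.Random(0)
dim = 4
edges = [(0, 1), (1, 2), (1, 3)]
a_hat = normalize_adjacency(edges, 4)

# Random stand-ins for pretrained knowledge graph embeddings and GCN weights.
kg_emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(4)]
w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
class_emb = gcn_layer(a_hat, kg_emb, w)

# A bag of three sentence encodings for one entity pair (random stand-ins).
sentences = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(3)]

# The tail relation attends at every level of its path, so it shares the
# coarse-level queries with the head relation under the same parent.
bag = coarse_to_fine(sentences, class_emb, path=[0, 1, 3])
_, weights = attend(sentences, class_emb[3])
```

In this sketch the head and tail relations share the coarse and mid-level nodes, so the graph convolution mixes their embeddings through the common parent. That is one way knowledge could transfer from data-rich to data-poor classes. A trained model would learn `w` and the sentence encoder jointly instead of sampling them.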