Unsupervised Relation Extraction (RE) aims to identify relations between entities in text without access to labeled data during training. This setting is particularly relevant for domain-specific RE, where no annotated dataset is available, and for open-domain RE, where the types of relations are not known a priori. Although recent approaches achieve promising results, they depend heavily on hyperparameters whose tuning typically requires labeled data. To mitigate this reliance on hyperparameters, we propose PromptORE, a "Prompt-based Open Relation Extraction" model. We adapt the novel prompt-tuning paradigm to work in an unsupervised setting and use it to embed sentences expressing a relation. We then cluster these embeddings to discover candidate relations, and we experiment with different strategies to automatically estimate an adequate number of clusters. To the best of our knowledge, PromptORE is the first unsupervised RE model that does not need hyperparameter tuning. Results on three general-domain and domain-specific datasets show that PromptORE consistently outperforms state-of-the-art models, with a relative gain of more than 40% in B³, V-measure, and ARI. Qualitative analysis also indicates PromptORE's ability to identify semantically coherent clusters that are very close to true relations.
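The abstract reports gains in B³ without defining it. As a rough illustration only, the following sketch computes B³ precision, recall, and F1 using the standard definition of the metric, which is assumed here rather than taken from the paper. The function name `b_cubed` and the relation labels in the example are hypothetical.

```python
from collections import Counter


def b_cubed(predicted, gold):
    """Return (precision, recall, f1) for two parallel label lists.

    predicted[i] is the cluster assigned to instance i,
    gold[i] is its true relation. Standard B-cubed definition (assumed).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not predicted:
        raise ValueError("at least one instance is required")

    n = len(predicted)
    cluster_sizes = Counter(predicted)          # instances per predicted cluster
    class_sizes = Counter(gold)                 # instances per gold relation
    overlap = Counter(zip(predicted, gold))     # instances per (cluster, relation)

    # Per-instance precision: share of the instance's cluster with its gold label.
    precision = sum(
        overlap[(p, g)] / cluster_sizes[p] for p, g in zip(predicted, gold)
    ) / n
    # Per-instance recall: share of the instance's gold class found in its cluster.
    recall = sum(
        overlap[(p, g)] / class_sizes[g] for p, g in zip(predicted, gold)
    ) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical toy data: five instances, two predicted clusters.
    clusters = [0, 0, 0, 1, 1]
    relations = ["born_in", "born_in", "works_for", "works_for", "works_for"]
    print(b_cubed(clusters, relations))
```

Because the score is computed per instance and then averaged, no mapping between cluster identifiers and relation names is needed, which suits an unsupervised setting where discovered clusters carry no labels.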