Recently, there has been an effort to extend fine-grained entity typing by using a richer, ultra-fine set of types and by labeling noun phrases, including pronouns and nominal nouns, rather than just named entity mentions. A key challenge for this ultra-fine entity typing task is that human-annotated data are extremely scarce, and the annotation ability of existing distant or weak supervision approaches is very limited. To remedy this problem, in this paper, we propose to obtain training data for ultra-fine entity typing by using a BERT Masked Language Model (MLM). Given a mention in a sentence, our approach constructs an input for the BERT MLM so that it predicts context-dependent hypernyms of the mention, which can then be used as type labels. Experimental results demonstrate that, with the help of these automatically generated labels, the performance of an ultra-fine entity typing model can be improved substantially. We also show that our approach can be applied to improve traditional fine-grained entity typing after performing simple type mapping.
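To illustrate the idea, the following is a minimal sketch of how an MLM input might be constructed so that the mask position elicits a context-dependent hypernym of the mention. The function name and the specific "and any other [MASK]" pattern are illustrative assumptions, not necessarily the paper's exact implementation.

```python
def build_mlm_input(sentence: str, mention: str, mask_token: str = "[MASK]") -> str:
    """Insert a Hearst-style pattern after the mention so that a masked
    language model predicts a hypernym of the mention at the mask position.

    Note: this pattern is one plausible choice; the actual prompts used in
    the paper may differ.
    """
    pattern = f"{mention} and any other {mask_token}"
    # Replace only the first occurrence of the mention with the pattern.
    return sentence.replace(mention, pattern, 1)

example = build_mlm_input(
    "In late 2018, Gates stepped down from the board.", "Gates")
print(example)
# -> In late 2018, Gates and any other [MASK] stepped down from the board.
```

Feeding such an input to a BERT MLM (e.g. via Hugging Face's `pipeline("fill-mask", model="bert-base-uncased")`) yields top predictions at the mask position, such as nouns like "person", which can serve as candidate type labels; the model call is omitted here to keep the sketch self-contained.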