We propose a new global entity disambiguation (ED) model based on contextualized embeddings of words and entities. Our model is based on BERT and trained with a new training task that enables it to capture both word-based local and entity-based global contextual information. The model casts ED as a sequential decision task and effectively exploits both types of contextual information. We achieve new state-of-the-art results on five standard ED datasets: AIDA-CoNLL, MSNBC, AQUAINT, ACE2004, and WNED-WIKI. Our source code and trained model checkpoint are available at https://github.com/studio-ousia/luke.
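To make the "sequential decision" framing concrete, below is a minimal Python sketch of one plausible decoding loop: mentions are resolved one at a time, most confident first, and each resolved entity is fed back as global context for the remaining decisions. The `score_candidates` stub stands in for the actual BERT forward pass over words, candidates, and already-resolved entities; it and the data layout are assumptions for illustration, not the released API.

```python
import numpy as np

# Stub scorer: in the actual model this would be a BERT forward pass over the
# words, the candidate entities, and the entities resolved so far. Random
# scores are used here purely to illustrate the control flow.
rng = np.random.default_rng(0)

def score_candidates(words, resolved, candidates):
    """Return a pseudo-confidence score per candidate entity for one mention."""
    return rng.random(len(candidates))

def disambiguate(words, mentions):
    """Resolve mentions sequentially, most confident first, so each decision
    can condition on the entities already chosen (global context)."""
    resolved = {}                        # mention index -> chosen entity
    unresolved = set(range(len(mentions)))
    while unresolved:
        best = None                      # (confidence, mention index, entity)
        for i in sorted(unresolved):
            scores = score_candidates(words, resolved, mentions[i]["candidates"])
            j = int(np.argmax(scores))
            if best is None or scores[j] > best[0]:
                best = (scores[j], i, mentions[i]["candidates"][j])
        _, i, entity = best
        resolved[i] = entity             # feed the decision back as global context
        unresolved.remove(i)
    return resolved

mentions = [
    {"span": "Paris", "candidates": ["Paris", "Paris, Texas", "Paris Hilton"]},
    {"span": "France", "candidates": ["France", "France (band)"]},
]
print(disambiguate("Paris is the capital of France".split(), mentions))
```

Resolving the easiest mentions first means the hard, ambiguous ones are decided last, when the most global evidence is available.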