In this paper, we present the first entity linking corpus for Icelandic. We describe our approach, which combines a multilingual entity linking model (mGENRE) with Wikipedia API Search (WAPIS) to label our data, and compare it to an approach that uses WAPIS alone. Our combined method reaches 53.9% coverage on our corpus, compared to 30.9% with WAPIS alone. We analyze our results and explain the value of using a multilingual system when working with Icelandic. Additionally, we analyze the data that remain unlabeled, identify patterns, and discuss why these cases may be more difficult to annotate.