The RuNNE Shared Task addresses the problem of nested named entity recognition. The annotation schema is designed so that an entity may partially overlap with another entity or be nested inside it. For example, the named entity "The Yermolova Theatre" of type "organization" contains another entity, "Yermolova", of type "person". We adopt the Russian NEREL dataset for the RuNNE Shared Task. NEREL comprises news texts written in Russian and collected from the Wikinews portal. The annotation schema includes 29 entity types. The nesting of named entities in NEREL reaches up to six levels. The RuNNE Shared Task explores two setups. (i) In the general setup, all entity types occur with more or less the same frequency. (ii) In the few-shot setup, the majority of entity types occur often in the training set. However, some of the entity types have lower frequency and are thus challenging to recognize. In the test set, the frequency of all entity types is even. This paper reports on the results of the RuNNE Shared Task. Overall, the shared task received 156 submissions from nine teams. Half of the submissions outperform a straightforward BERT-based baseline in both setups. This paper overviews the shared task setup and discusses the submitted systems, uncovering meaningful insights into the problem of nested NER. The links to the evaluation platform and the data from the shared task are available in our GitHub repository: https://github.com/dialogue-evaluation/RuNNE.
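To make the notion of nesting concrete, the following minimal sketch represents the "The Yermolova Theatre" example as character-offset spans and computes the nesting depth. The `Span` class, the helper names, and the offset convention (inclusive start, exclusive end) are illustrative assumptions and are not taken from the NEREL data format or from any submitted system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical entity span: character offsets plus an entity type."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies inside `outer` and covers fewer characters."""
    same_extent = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_extent


def nesting_depth(spans: list[Span]) -> int:
    """Depth of the deepest nesting chain; 0 when there are no spans."""
    depth: dict[Span, int] = {}
    # Process longer spans first so every container is scored before its contents.
    for span in sorted(spans, key=lambda s: s.end - s.start, reverse=True):
        parent_depths = [d for other, d in depth.items() if contains(other, span)]
        depth[span] = 1 + max(parent_depths, default=0)
    return max(depth.values(), default=0)


text = "The Yermolova Theatre"
spans = [
    Span(0, 21, "organization"),  # "The Yermolova Theatre"
    Span(4, 13, "person"),        # "Yermolova", nested in the organization
]

for span in spans:
    print(span.label, repr(text[span.start:span.end]))
print("nesting depth:", nesting_depth(spans))
```

Under this representation a flat sequence tagger cannot label both entities at once, because the token "Yermolova" belongs to two spans with different types.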