We present a new corpus with coreference annotation, the Russian Coreference Corpus (RuCoCo). The goal of RuCoCo is to obtain a large number of annotated texts while maintaining high inter-annotator agreement. RuCoCo contains news texts in Russian: some were annotated from scratch, while for the rest, machine-generated annotations were refined by human annotators. The corpus comprises one million words and around 150,000 mentions. We make the corpus publicly available.