We describe the winning submission to the CRAC 2022 Shared Task on Multilingual Coreference Resolution. Our system first solves mention detection and then coreference linking on the retrieved spans with an antecedent-maximization approach; both tasks are fine-tuned jointly with shared Transformer weights. We report the results of fine-tuning a wide range of pretrained models. At the center of this contribution are the fine-tuned multilingual models. We found that a single large multilingual model with a sufficiently large encoder increases performance on all datasets across the board, with the benefit not limited to underrepresented languages or groups of typologically related languages. The source code is available at https://github.com/ufal/crac2022-corpipe.
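To make the described two-stage design concrete, the following is a minimal sketch, not the authors' implementation (the actual code is at the repository above, and its details differ): a single shared encoder feeds both a token-level mention-detection head and a pairwise antecedent head, and at inference each mention links to its highest-scoring earlier mention or to a dummy "no antecedent" option (antecedent maximization). The class name, the start-token mention pooling, and the generic `encoder` module are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CorPipeSketch(nn.Module):
    """Sketch of the two-stage setup: one shared encoder, a mention-detection
    head, and an antecedent-selection head, fine-tuned jointly."""

    def __init__(self, encoder: nn.Module, hidden: int, num_tags: int):
        super().__init__()
        self.encoder = encoder                      # shared Transformer weights
        self.tagger = nn.Linear(hidden, num_tags)   # mention detection (e.g. BIO tags)
        self.query = nn.Linear(hidden, hidden)      # antecedent scoring projections
        self.key = nn.Linear(hidden, hidden)

    def forward(self, input_ids, mention_starts):
        # One encoder pass serves both tasks, so both losses update its weights.
        h = self.encoder(input_ids)                 # (T, H) contextual embeddings
        tag_logits = self.tagger(h)                 # stage 1: per-token mention tags
        m = h[mention_starts]                       # (M, H) mention representations
        scores = self.query(m) @ self.key(m).t()    # (M, M) pairwise antecedent scores
        # Each mention may only link to a strictly earlier mention.
        future = torch.triu(torch.ones_like(scores), diagonal=0).bool()
        scores = scores.masked_fill(future, float("-inf"))
        # Prepend a "no antecedent" column; at inference each mention takes the
        # argmax over its row, i.e. its single highest-scoring antecedent.
        dummy = torch.zeros(scores.size(0), 1, device=scores.device)
        return tag_logits, torch.cat([dummy, scores], dim=1)

# Toy usage with an embedding layer standing in for a pretrained encoder.
model = CorPipeSketch(nn.Embedding(1000, 64), hidden=64, num_tags=3)
tags, antecedents = model(torch.randint(0, 1000, (12,)), torch.tensor([2, 5, 9]))
```

Joint fine-tuning here simply means summing a tagging cross-entropy over `tags` and an antecedent cross-entropy over `antecedents`, so a single backward pass updates the shared encoder for both tasks.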