We present SwissBERT, a masked language model created specifically for processing Switzerland-related text. SwissBERT is a pre-trained model that we adapted to news articles written in the national languages of Switzerland -- German, French, Italian, and Romansh. We evaluate SwissBERT on natural language understanding tasks related to Switzerland and find that it tends to outperform previous models on these tasks, especially when processing contemporary news and/or Romansh Grischun. Since SwissBERT uses language adapters, it may be extended to Swiss German dialects in future work. The model and our open-source code are publicly released at https://github.com/ZurichNLP/swissbert.