Comparison of Czech Transformers on Text Classification Tasks

In this paper, we present our progress in pre-training monolingual Transformers for Czech and contribute to the research community by releasing our models to the public. The need for such models emerged from our effort to employ Transformers in our language-specific tasks, where we found the performance of the published multilingual models to be very limited. Since multilingual models are usually pre-trained on 100+ languages, most low-resource languages (including Czech) are under-represented in them. At the same time, a huge amount of monolingual training data is available in web archives such as Common Crawl. We have pre-trained and publicly released two monolingual Czech Transformers and compared them with relevant public models trained (at least partially) for Czech. The paper presents the Transformer pre-training procedure as well as a comparison of the pre-trained models on text classification tasks from various domains.

