This paper describes the training process of the first Czech monolingual language representation models based on the BERT and ALBERT architectures. We pre-train our models on more than 340K sentences, which is 50 times more Czech data than is included in multilingual models. We outperform the multilingual models on 9 out of 11 datasets. In addition, we establish new state-of-the-art results on nine datasets. Finally, we discuss the properties of monolingual and multilingual models based upon our results. We publish all the pre-trained and fine-tuned models freely for the research community.