The era of transfer learning has revolutionized the fields of Computer Vision and Natural Language Processing, bringing powerful pretrained models that achieve exceptional performance across a wide variety of tasks. In particular, Natural Language Processing tasks have been dominated by transformer-based language models. On Natural Language Inference and Natural Language Generation tasks, the BERT model and its variants, as well as the GPT model and its successors, have demonstrated exemplary performance. However, the majority of these models are pretrained and assessed primarily on English or on multilingual corpora. In this paper, we introduce GreekBART, the first Seq2Seq model based on the BART-base architecture and pretrained on a large-scale Greek corpus. We evaluate and compare GreekBART against BART-random, Greek-BERT, and XLM-R on a variety of discriminative tasks. In addition, we examine its performance on two NLG tasks from GreekSUM, a newly introduced summarization dataset for the Greek language. The model, the code, and the new summarization dataset will be made publicly available.