The recent introduction of Transformer language representation models has enabled substantial improvements in many natural language processing (NLP) tasks. However, while the performance achieved by these architectures is impressive, their usability is limited by the large number of parameters in their networks, which results in high computational and memory demands. In this work we present BERTino, a DistilBERT model that aims to be the first lightweight alternative to the BERT architecture specifically for the Italian language. We evaluated BERTino on the Italian ISDT, Italian ParTUT, and Italian WikiNER tasks, as well as on a multiclass classification task, obtaining F1 scores comparable to those of a BERT-base model with a remarkable improvement in training and inference speed.
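As a minimal sketch of how a distilled model like BERTino could be used in practice, the snippet below loads a DistilBERT-style checkpoint with the Hugging Face transformers library and runs a forward pass on an Italian sentence. The model identifier is an assumption for illustration only; substitute the actual published checkpoint name.

```python
# Minimal sketch: loading a DistilBERT-style Italian model with
# Hugging Face transformers and extracting contextual embeddings.
from transformers import AutoTokenizer, AutoModel

# Hypothetical checkpoint name, used here purely for illustration.
model_name = "indigo-ai/BERTino"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Tokenize an Italian sentence and run a forward pass.
inputs = tokenizer("Il gatto dorme sul divano.", return_tensors="pt")
outputs = model(**inputs)

# Shape: (batch_size, sequence_length, hidden_size).
print(outputs.last_hidden_state.shape)
```

Because a distilled model has fewer layers and parameters than BERT-base, the same pipeline yields faster inference, which is the trade-off the abstract highlights against the comparable F1 scores.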