In this paper we present our submission for the EACL 2021 Shared Task on Offensive Language Identification in Dravidian Languages. Our final system is an ensemble of mBERT and XLM-RoBERTa models that leverages task-adaptive pretraining of multilingual BERT models with a masked language modeling objective. Our system ranked 1st for Kannada, 2nd for Malayalam, and 3rd for Tamil.
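To illustrate the task-adaptive pretraining step, the sketch below shows continued masked language modeling on unlabeled task text using the HuggingFace transformers library. This is a minimal, assumed setup, not the authors' exact configuration: the corpus file name, hyperparameters, and output directory are illustrative placeholders.

```python
# Minimal sketch of task-adaptive pretraining (continued MLM training on
# task-domain text) for mBERT. Hyperparameters and file names are
# illustrative assumptions, not the paper's reported settings.
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Unlabeled task text, one example per line (hypothetical file).
dataset = load_dataset("text", data_files={"train": "task_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Dynamic masking: 15% of tokens are masked for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="mbert-tapt",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```

The adapted checkpoint would then be fine-tuned on the labeled offensive-language data; the same procedure applies analogously to XLM-RoBERTa before ensembling.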