In this paper, we tackle the Arabic Fine-Grained Hate Speech Detection shared task and demonstrate significant improvements over the reported baselines on its three subtasks. The subtasks are to predict (1) whether a tweet contains offensive language, (2) whether it constitutes hate speech, and, if so, (3) its fine-grained hate speech label from one of six categories. Our final solution is an ensemble of models that employs multitask learning and a self-consistency correction method, yielding 82.7% on the hate speech subtask, a 3.4% relative improvement over previous work.
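
To illustrate the dependency between the three subtasks, the following minimal sketch shows one way a consistency correction over joint predictions could look. It is not the method used in this work. The names `Labels` and `make_consistent` are hypothetical, and the sketch rests on two assumptions not stated above: that hate speech is treated as a subset of offensive language, and that the coarser prediction is trusted whenever two subtasks disagree.

```python
from typing import NamedTuple, Optional


class Labels(NamedTuple):
    """Joint prediction for one tweet (hypothetical container)."""
    offensive: bool          # subtask 1
    hate_speech: bool        # subtask 2
    category: Optional[str]  # subtask 3: one of six categories, or None


def make_consistent(pred: Labels) -> Labels:
    """Resolve contradictions top-down, trusting the coarser subtask.

    Assumption: a tweet that is not offensive cannot be hate speech,
    and only hate speech tweets carry a fine-grained category.
    """
    hate = pred.hate_speech and pred.offensive
    category = pred.category if hate else None
    return Labels(pred.offensive, hate, category)
```

Under these assumptions, a prediction of "not offensive" combined with "hate speech" is corrected to "not hate speech" with no fine-grained label, while an already consistent prediction is returned unchanged.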