With the rapid growth of mobile computing and Web technologies, offensive language has become more prevalent on social networking platforms. Identifying offensive language in local languages is essential for moderating social media content, so in this paper we work with three under-resourced Dravidian languages: Malayalam, Tamil, and Kannada. We present an evaluation task, held as HASOC-DravidianCodeMix at FIRE 2020 and at DravidianLangTech at EACL 2021, designed to provide a framework for comparing different approaches to this problem. This paper describes the data creation, defines the task, lists the participating systems, and discusses the various methods used.