Abusive language on online platforms is a major societal problem, often contributing to serious harms such as the marginalisation of underrepresented minorities. Abusive language takes many forms, including hate speech, profanity, and cyber-bullying, and online platforms seek to moderate it in order to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Within the field of Natural Language Processing, researchers have developed different methods for automatically detecting abusive language, often focusing on specific subproblems or on narrow communities, as what is considered abusive very much depends on context. We argue that there is currently a dichotomy between the types of abusive language that online platforms seek to curb and the research efforts aimed at automatically detecting it. We thus survey existing detection methods as well as the content moderation policies of online platforms in this light, and we suggest directions for future work.