Online abusive language detection (ALD) has become a societal issue of increasing importance in recent years. Most previous work on online ALD has focused on a single abusive language problem in a single domain, such as Twitter, and has not transferred successfully to other ALD tasks or domains. In this paper, we introduce a new generic ALD framework, MACAS, which is capable of addressing several types of ALD tasks across different domains. The framework uses multi-aspect abusive language embeddings that represent the target and content aspects of abusive language, and applies a textual graph embedding that analyses the user's linguistic behaviour. We then propose a cross-attention gate flow mechanism to combine these multiple aspects of abusive language. Quantitative and qualitative evaluation results show that our ALD algorithm rivals or exceeds six state-of-the-art ALD algorithms across seven ALD datasets covering multiple aspects of abusive language and different online community domains.