Cyberbullying is a prevalent and growing social problem, driven by the surge in social media usage. Minorities, women, and adolescents are among its most common victims. Despite advances in NLP, automated cyberbullying detection remains challenging. This paper advances the task using state-of-the-art NLP techniques. We use a Twitter dataset from SemEval 2019 Task 5 (HatEval) on hate speech against women and immigrants. Our best-performing ensemble model, based on DistilBERT, achieves F1 scores of 0.73 and 0.74 on classifying hate speech (Task A) and on classifying aggressiveness and target (Task B), respectively. We adapt the ensemble model developed for Task A to classify offensive language in external datasets and achieve an F1 score of approximately 0.7 on three benchmark datasets, a promising result for cross-domain adaptability. Finally, we conduct a qualitative analysis of misclassified tweets and offer recommendations for future cyberbullying research.
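To make the reported metric concrete, the following is a minimal sketch of how an ensemble's binary predictions could be combined and scored. It rests on two assumptions that the abstract does not state: that the ensemble combines its members by majority vote, and that F1 is macro-averaged over the two classes. The function names and the toy labels are hypothetical and are not taken from the paper or the HatEval data.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote.

    Assumes an odd number of models and binary labels, so ties cannot occur.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]


def f1_for_class(y_true, y_pred, positive):
    """F1 score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Toy example: 1 = hateful, 0 = not hateful (made-up labels).
y_true = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 1, 0, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 1]

ensemble_pred = majority_vote([model_a, model_b, model_c])
score = macro_f1(y_true, ensemble_pred)
```

In this toy case the vote corrects the isolated errors of individual models but keeps the one error that two models share, which is the usual behavior of a voting ensemble.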