In this paper, a BERT-based neural network model is applied to the JIGSAW data set to build a classifier that identifies hateful and toxic comments (strictly separated from offensive language) on English-language online social platforms, in this case Twitter. Three other neural network architectures and a GPT-2 model are also trained on the same data set for comparison. The trained BERT model is then applied to two further data sets to evaluate its generalisation power: another Twitter data set and HASOC 2019, which contains both Twitter and Facebook comments; we focus on the English HASOC 2019 data. In addition, we show that fine-tuning the trained BERT model on these two data sets under different transfer learning scenarios, retraining either some or all layers, improves the predictive scores compared with simply applying the model pre-trained on the JIGSAW data set. We obtain precisions from 64% to around 90% while still achieving acceptable recall values of at least the low 60s (in percent), demonstrating that BERT is suitable for real use cases on social platforms.
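The transfer learning scenarios mentioned above (retraining some or all layers) can be illustrated with a minimal sketch. It is not the authors' implementation: the helper `trainable_parameter_names`, the 12-layer encoder, and the choice to keep the pooler and classifier trainable while freezing the embeddings are all assumptions made for illustration. The parameter names follow the naming pattern commonly seen in BERT classification checkpoints (`encoder.layer.<n>.`).

```python
import re

# Hypothetical helper (not from the paper): matches the encoder layer index
# in parameter names such as "bert.encoder.layer.11.output.dense.weight".
_LAYER_RE = re.compile(r"encoder\.layer\.(\d+)\.")


def trainable_parameter_names(param_names, num_layers, retrain_top_k):
    """Select the parameters to retrain in a partial fine-tuning scenario.

    Assumption: the top `retrain_top_k` encoder layers plus everything above
    the encoder (pooler, classifier) are retrained; embeddings and lower
    encoder layers stay frozen. If `retrain_top_k` covers all layers, every
    parameter is retrained.
    """
    if retrain_top_k >= num_layers:
        return list(param_names)

    first_trainable = num_layers - retrain_top_k
    selected = []
    for name in param_names:
        match = _LAYER_RE.search(name)
        if match:
            # Encoder layer: keep only the top k layers trainable.
            if int(match.group(1)) >= first_trainable:
                selected.append(name)
        elif "embeddings." not in name:
            # Pooler and classification head are always retrained.
            selected.append(name)
    return selected


# Illustrative parameter names for a 12-layer model.
names = [
    "bert.embeddings.word_embeddings.weight",
    "bert.encoder.layer.0.attention.self.query.weight",
    "bert.encoder.layer.10.output.dense.weight",
    "bert.encoder.layer.11.output.dense.weight",
    "bert.pooler.dense.weight",
    "classifier.weight",
]

partial = trainable_parameter_names(names, num_layers=12, retrain_top_k=2)
full = trainable_parameter_names(names, num_layers=12, retrain_top_k=12)
```

In a PyTorch training loop, one would typically iterate over `model.named_parameters()` and set `requires_grad = False` on every parameter whose name is not in the selected list; how many layers to unfreeze is a tuning choice that the abstract does not specify.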