This document summarizes our results for the NLP lecture at ETH in the spring semester 2021. In this work, a BERT-based neural network model (Devlin et al., 2018) is applied to the JIGSAW data set (Jigsaw/Conversation AI, 2019) in order to build a model that identifies hateful and toxic comments (strictly separated from offensive language) on English-language online social platforms, in this case Twitter. Three other neural network architectures and a GPT-2 model (Radford et al., 2019) are also applied to the provided data set in order to compare these different models. The trained BERT model is then applied to two further data sets to evaluate its generalisation power: another Twitter data set (Davidson et al., 2017) and the HASOC 2019 data set (Mandl et al., 2019), which includes both Twitter and Facebook comments; we focus on the English HASOC 2019 data. In addition, we show that fine-tuning the trained BERT model on these two data sets under different transfer learning scenarios, i.e. retraining some or all of its layers, improves the predictive scores compared to simply applying the model pre-trained on the JIGSAW data set. With our results, we obtain precisions from 64% to around 90% while still achieving acceptable recall values of at least the low 60% range, indicating that BERT is suitable for real use cases on social platforms.
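The transfer learning scenarios mentioned above amount to choosing which BERT layers are retrained during fine-tuning. The following is a minimal sketch, not the authors' actual code, assuming the Hugging Face `transformers` library as the implementation; the number of unfrozen top layers is a hypothetical choice for illustration.

```python
# Minimal sketch of partial-layer fine-tuning of BERT for binary
# toxic-comment classification. Assumes Hugging Face `transformers`;
# the actual layer choices in the paper may differ.
from transformers import BertForSequenceClassification

# Load BERT with a 2-class classification head (toxic vs. non-toxic).
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Scenario A: retrain all layers -- every parameter is trainable by default.

# Scenario B: retrain only the top encoder layers plus the classifier head.
N_TRAINABLE_TOP_LAYERS = 2  # hypothetical value, not taken from the paper

# Freeze the entire BERT encoder first.
for param in model.bert.parameters():
    param.requires_grad = False

# Unfreeze the top N transformer blocks.
for layer in model.bert.encoder.layer[-N_TRAINABLE_TOP_LAYERS:]:
    for param in layer.parameters():
        param.requires_grad = True

# The task-specific classification head always stays trainable.
for param in model.classifier.parameters():
    param.requires_grad = True
```

Freezing the lower layers keeps the general language representations learned during pre-training intact, while the unfrozen top layers adapt to the target data set (here, the Davidson or HASOC data).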