This study addresses the growing concern of trolling behavior on social media by developing and evaluating a set of model architectures for the automatic detection of troll tweets. The architectures combine deep learning encoders with pre-trained word embeddings (BERT, ELMo, and GloVe), and we evaluate each one using classification accuracy, F1 score, AUC, and precision. Our results indicate that BERT and ELMo embeddings outperformed GloVe, likely because they provide contextualized word embeddings that better capture the nuances of language use on social media. We also found that CNN and GRU encoders performed similarly in terms of F1 score and AUC, suggesting that both are effective at extracting relevant information from the input text. The best-performing model was an ELMo-based architecture with a GRU classifier, which achieved an AUC of 0.929. These results highlight the importance of contextualized word embeddings and appropriate encoders for troll tweet detection, and they can help social media systems improve how they identify and address trolling behavior on their platforms.
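The four evaluation metrics named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the label convention (1 = troll, 0 = non-troll) are assumptions made for illustration only. AUC is computed through its rank interpretation, the probability that a randomly chosen troll tweet receives a higher score than a randomly chosen non-troll tweet.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> Dict[str, float]:
    """Compute accuracy, precision, F1, and AUC for binary troll detection."""
    # Turn model scores into hard predictions (1 = troll, 0 = non-troll).
    y_pred = [1 if s >= threshold else 0 for s in scores]

    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    return {
        "accuracy": accuracy,
        "precision": precision,
        "f1": f1,
        "auc": roc_auc(y_true, scores),
    }


if __name__ == "__main__":
    # Toy labels and scores, invented purely to exercise the functions.
    labels = [1, 0, 1, 1, 0, 0]
    model_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
    print(evaluate(labels, model_scores))
```

Accuracy, precision, and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank the tweets, which is one reason the two kinds of metric are often reported together.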