In recent years, the problem of rumours on online social media (OSM) has attracted considerable attention. Researchers have approached it from two main directions: descriptive analysis of rumours, and techniques to detect (or classify) them. In the descriptive line of work, where researchers have analysed rumours using NLP approaches, there has been little emphasis on psycholinguistic analysis of social media text. Such analyses of rumour case studies are vital for drawing meaningful conclusions that help mitigate misinformation. For our analysis, we explored the PHEME9 rumour dataset (consisting of 9 events), including source tweets (both rumour and non-rumour categories) and response tweets. We compared the rumour and non-rumour source tweets, and then their corresponding reply (response) tweets, to understand how they differ linguistically for each event. Furthermore, we evaluated whether these features can be used to classify rumour vs. non-rumour tweets with machine learning models. To this end, we employed various classical and ensemble-based approaches. To identify the most discriminative psycholinguistic features, we used the SHAP AI explainability tool. To summarise, this research contributes an in-depth psycholinguistic analysis of rumours related to various kinds of events.