The growing societal dependence on social media and user-generated content for news and information has increased the influence of unreliable sources and fake content, muddling public discourse and eroding trust in the media. Validating the credibility of such information is a difficult task that is susceptible to confirmation bias, which has motivated the development of algorithmic techniques to distinguish fake from real news. However, most existing methods are difficult to interpret, making it hard to establish trust in their predictions, and they rely on assumptions that are unrealistic in many real-world scenarios, e.g., the availability of audiovisual features or provenance. In this work, we focus on fake news detection of textual content using interpretable features and methods. In particular, we develop a deep probabilistic model that integrates a dense representation of textual news, learned with a variational autoencoder and bi-directional Long Short-Term Memory (LSTM) networks, with semantic topic-related features inferred from a Bayesian admixture model. Extensive experimental studies on three real-world datasets demonstrate that our model achieves performance comparable to state-of-the-art competing models while facilitating interpretability through the learned topics. Finally, we conduct model ablation studies to justify the effectiveness of integrating neural embeddings and topic features, both quantitatively through predictive performance and qualitatively through class separability in lower-dimensional embeddings.
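To make the described architecture concrete, the following is a minimal, hypothetical sketch of such a hybrid model: a bi-directional LSTM text encoder with a VAE-style latent bottleneck whose latent code is concatenated with topic proportions (e.g., from an LDA-like admixture model) before a binary fake/real classifier. All names, layer sizes, and the choice of PyTorch are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: BiLSTM + VAE latent code concatenated with topic
# features for fake news classification. Sizes and names are assumptions.
import torch
import torch.nn as nn

class TopicAugmentedVAEClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256,
                 latent_dim=64, num_topics=50):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Bi-directional LSTM yields a dense representation of the text.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                              bidirectional=True)
        # VAE-style bottleneck: mean and log-variance of the latent code.
        self.fc_mu = nn.Linear(2 * hidden_dim, latent_dim)
        self.fc_logvar = nn.Linear(2 * hidden_dim, latent_dim)
        # Classifier over the latent code concatenated with topic features.
        self.classifier = nn.Sequential(
            nn.Linear(latent_dim + num_topics, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, token_ids, topic_props):
        # token_ids: (batch, seq_len); topic_props: (batch, num_topics)
        h, _ = self.bilstm(self.embed(token_ids))
        pooled = h.mean(dim=1)  # average hidden states over time steps
        mu, logvar = self.fc_mu(pooled), self.fc_logvar(pooled)
        # Reparameterization trick: sample z ~ N(mu, sigma^2).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        logits = self.classifier(torch.cat([z, topic_props], dim=-1))
        return logits, mu, logvar  # mu/logvar feed the KL term of a VAE loss
```

In such a setup, training would plausibly combine a binary cross-entropy loss on the logits with the KL-divergence regularizer from the returned mu and logvar, while the topic proportions provide the interpretable features the abstract refers to.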