The prevalence and perniciousness of fake news have become a critical issue on the Internet, which in turn has stimulated the development of automatic fake news detection. In this paper, we focus on evidence-based fake news detection, where several pieces of evidence are used to probe the veracity of a news item (i.e., a claim). Most previous methods first employ sequential models to embed the semantic information and then capture the claim-evidence interaction with various attention mechanisms. Despite their effectiveness, they still suffer from two main weaknesses. First, because of the inherent limitations of sequential models, they fail to integrate relevant information that is scattered far apart within the evidence for veracity checking. Second, they overlook the large amount of redundant information in the evidence, which may be useless or even harmful. To solve these problems, we propose a unified Graph-based sEmantic sTructure mining framework, GET for short. Specifically, unlike existing work that treats claims and evidence as sequences, we model them as graph-structured data and capture the long-distance semantic dependencies among dispersed relevant snippets via neighborhood propagation. After obtaining contextual semantic information, our model reduces information redundancy by performing graph structure learning. Finally, the fine-grained semantic representations are fed into the downstream claim-evidence interaction module for prediction. Comprehensive experiments demonstrate the superiority of GET over state-of-the-art methods.
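The three ideas named above (text as a graph, neighborhood propagation, and redundancy reduction) can be illustrated with a minimal sketch. It is not the GET implementation. The sliding-window graph construction, the mean aggregation rule, the top-k pruning, and every function name are assumptions made for illustration, and the node scores here are placeholders for what a trained model would learn.

```python
import numpy as np


def build_window_graph(tokens, window=2):
    """Unique tokens become nodes; tokens within `window` positions are linked.

    Assumed construction: repeated words collapse into a single node, so
    snippets that are far apart in the sequence can become graph neighbors.
    """
    vocab = sorted(set(tokens))
    index = {tok: i for i, tok in enumerate(vocab)}
    adj = np.zeros((len(vocab), len(vocab)))
    for i, tok in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            a, b = index[tok], index[tokens[j]]
            if a != b:  # no self-loops at construction time
                adj[a, b] = adj[b, a] = 1.0
    return vocab, adj


def propagate(adj, feats):
    """One step of mean aggregation over each node and its neighbors."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)  # row-wise degree
    return (a_hat / deg) @ feats


def prune_nodes(adj, feats, scores, keep):
    """Keep the `keep` highest-scoring nodes and the edges among them.

    `scores` stands in for a learned redundancy score (hypothetical here).
    """
    kept = np.sort(np.argsort(-scores, kind="stable")[:keep])
    return kept, adj[np.ix_(kept, kept)], feats[kept]


if __name__ == "__main__":
    evidence = "the vaccine was tested and the vaccine is safe".split()
    vocab, adj = build_window_graph(evidence, window=2)
    feats = np.random.default_rng(0).normal(size=(len(vocab), 4))
    hidden = propagate(adj, feats)
    # Node degree is used as a placeholder score, not a learned one.
    kept, sub_adj, sub_feats = prune_nodes(adj, hidden, adj.sum(axis=1), keep=3)
    print([vocab[i] for i in kept], sub_adj.shape, sub_feats.shape)
```

In this toy evidence sentence the two occurrences of "vaccine" merge into one node, so "tested" and "safe" end up two hops apart in the graph even though they sit at opposite ends of the sequence. That is the kind of long-distance dependency a purely sequential encoder has to carry across many intermediate steps.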