In recent years, misinformation on the Web has become increasingly rampant. The research community has responded by proposing systems and challenges, which are beginning to be useful for (various subtasks of) detecting misinformation. However, most proposed systems are based on deep learning techniques which are fine-tuned to specific domains, are difficult to interpret, and produce results which are not machine readable. This limits their applicability and adoption, as they can only be used by a select expert audience in very specific settings. In this paper we propose an architecture based on the core concept of Credibility Reviews (CRs) that can be used to build networks of distributed bots that collaborate for misinformation detection. The CRs serve as building blocks to compose graphs of (i) web content, (ii) existing credibility signals (fact-checked claims and reputation reviews of websites), and (iii) automatically computed reviews. We implement this architecture on top of lightweight extensions to Schema.org and services providing generic NLP tasks for semantic similarity and stance detection. Evaluations on existing datasets of social-media posts, fake news, and political speeches demonstrate several advantages over existing systems: extensibility, domain-independence, composability, explainability, and transparency via provenance. Furthermore, we obtain competitive results without requiring fine-tuning and establish a new state of the art on the Clef'18 CheckThat! Factuality task.
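To make the core concept more concrete, the sketch below shows how a Credibility Review might be expressed as Schema.org-style JSON-LD and composed from an existing credibility signal. This is only an illustrative assumption: the `CredibilityReview` type and `confidence` field are hypothetical stand-ins for the paper's lightweight Schema.org extensions, while `itemReviewed`, `reviewRating`, `author`, and `isBasedOn` are standard Schema.org terms.

```python
import json

# Hypothetical sketch of a Credibility Review (CR) graph, NOT the paper's
# exact vocabulary. Standard schema.org properties (itemReviewed,
# reviewRating, author, isBasedOn) carry the review content and provenance;
# "CredibilityReview" and "confidence" are assumed extension names.

# (ii) An existing credibility signal: a reputation review of a website.
website_reputation = {
    "@context": "https://schema.org",
    "@type": "Review",
    "itemReviewed": {"@type": "WebSite", "url": "https://example.org"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": -0.4,
        "ratingExplanation": "Low reputation according to external site lists",
    },
}

# (iii) An automatically computed review of (i) a piece of web content,
# linked via isBasedOn to the signals it was composed from (provenance).
computed_cr = {
    "@context": "https://schema.org",
    "@type": "CredibilityReview",            # hypothetical extension type
    "itemReviewed": {"@type": "Article", "url": "https://example.org/post/123"},
    "reviewRating": {
        "@type": "Rating",
        "ratingValue": -0.3,
        "confidence": 0.7,                    # hypothetical extension property
        "ratingExplanation": "Published on a low-reputation site and agrees "
                             "with a claim disputed by a fact-check",
    },
    "author": {"@type": "SoftwareApplication", "name": "credibility-bot"},
    "isBasedOn": [website_reputation],
}

print(json.dumps(computed_cr, indent=2))
```

Because each CR links back to the reviews it was derived from, a consuming bot (or a human) can follow the `isBasedOn` chain to inspect the evidence behind an aggregated credibility rating, which is what makes the resulting assessments explainable and machine readable.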