The past decade has seen a substantial rise in the amount of mis- and disinformation online, from targeted disinformation campaigns to influence politics, to the unintentional spreading of misinformation about public health. This development has spurred research in the area of automatic fact checking, from approaches for detecting check-worthy claims and determining the stance of tweets towards claims, to methods for determining the veracity of claims given evidence documents. These automatic methods are often content-based, relying on natural language processing techniques, which in turn utilise deep neural networks to learn higher-order features from text in order to make predictions. As deep neural networks are black-box models, their inner workings cannot be easily explained. At the same time, it is desirable to explain how they arrive at certain decisions, especially if they are to be used for decision making. While this has been known for some time, the issues it raises have been exacerbated by models increasing in size, by EU legislation requiring models used for decision making to provide explanations, and, very recently, by legislation requiring online platforms operating in the EU to report transparently on their services. Despite this, current solutions for explainability are still lacking in the area of fact checking. This thesis presents my research on automatic fact checking, including claim check-worthiness detection, stance detection and veracity prediction. Its contributions go beyond fact checking, with the thesis proposing more general machine learning solutions for natural language processing in the area of learning with limited labelled data. Finally, the thesis presents some first solutions for explainable fact checking.