The rapid development of social media provides a hotbed for the dissemination of fake news, which misleads readers and has negative effects on society. News typically combines text and images to be more vivid; consequently, multi-modal fake news detection has received wide attention. Prior efforts primarily conduct multi-modal fusion via simple concatenation or co-attention mechanisms, leading to sub-optimal performance. In this paper, we propose MMNet, a novel mutual-learning-network-based model that enhances multi-modal fusion for fake news detection via mutual learning between text-centered and vision-centered views of the same classification objective. Specifically, we design two detection modules based on text-centered and vision-centered multi-modal fusion features, respectively, and enable mutual learning between the two modules to facilitate multi-modal fusion, exploiting the latent consistency between the two modules trained towards the same objective. Moreover, we account for the influence of the image-text matching degree on news authenticity judgment by designing an image-text-matching-aware co-attention mechanism for multi-modal fusion. Extensive experiments on three benchmark datasets demonstrate that our proposed MMNet achieves superior performance in fake news detection.
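The mutual-learning objective between the two detection modules can be sketched as follows. This is a minimal illustration under the standard deep-mutual-learning formulation (each module is supervised by the labels and softly aligned with the other module's predicted distribution via a KL term); the exact loss weighting and architecture are assumptions, not the paper's specification.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class dimension.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def kl_div(p, q):
    # Mean KL(p || q) over the batch.
    return np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1))

def mutual_learning_loss(logits_text, logits_vision, labels, alpha=0.5):
    """Joint loss for the text-centered and vision-centered modules:
    each gets a supervised cross-entropy term plus a KL term that
    pulls its predictions toward the other module's (alpha is an
    assumed trade-off hyperparameter)."""
    p_t, p_v = softmax(logits_text), softmax(logits_vision)
    loss_t = cross_entropy(p_t, labels) + alpha * kl_div(p_v, p_t)
    loss_v = cross_entropy(p_v, labels) + alpha * kl_div(p_t, p_v)
    return loss_t + loss_v
```

When the two modules agree perfectly, the KL terms vanish and the loss reduces to the sum of the two supervised terms; disagreement adds a consistency penalty that drives the mutual learning.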