Recent years have witnessed the proliferation of fake news, propaganda, misinformation, and disinformation online. While initially this was mostly about textual content, over time images and videos gained popularity, as they are much easier to consume, attract much more attention, and spread further than simple text. As a result, researchers started targeting different modalities and combinations thereof. Because different modalities are studied in different research communities, with insufficient interaction, here we offer a survey that explores the state of the art in multimodal disinformation detection, covering various combinations of modalities: text, images, audio, video, network structure, and temporal information. Moreover, while some studies focused on factuality, others investigated how harmful the content is. While these two components in the definition of disinformation, (i) factuality and (ii) harmfulness, are equally important, they are typically studied in isolation. Thus, we argue for the need to tackle disinformation detection by taking into account multiple modalities as well as both factuality and harmfulness in the same framework. Finally, we discuss current challenges and future research directions.