Recent years have witnessed the proliferation of offensive content online, such as fake news, propaganda, misinformation, and disinformation. While initially this was mostly textual content, over time images and videos gained popularity, as they are much easier to consume, attract more attention, and spread further than text. As a result, researchers started leveraging different modalities and combinations thereof to tackle multimodal offensive content online. In this study, we offer a survey of the state of the art in multimodal disinformation detection, covering various combinations of modalities: text, images, speech, video, social media network structure, and temporal information. Moreover, while some studies focused on factuality, others investigated how harmful the content is. Although these two components in the definition of disinformation, (i) factuality and (ii) harmfulness, are equally important, they are typically studied in isolation. Thus, we argue for the need to tackle disinformation detection by taking into account multiple modalities as well as both factuality and harmfulness in the same framework. Finally, we discuss current challenges and future research directions.