Among the various modes of communication on social media, Internet memes have emerged as a powerful means of conveying political, psychological, and socio-cultural opinions. Although memes are typically humorous, there has recently been a proliferation of harmful memes aimed at abusing various social entities. As most harmful memes are highly satirical and abstruse without appropriate context, off-the-shelf multimodal models may not be adequate to understand their underlying semantics. In this work, we propose two novel problem formulations: detecting harmful memes and identifying the social entities that they target. To this end, we present HarMeme, the first benchmark dataset, containing 3,544 memes related to COVID-19. Each meme went through a rigorous two-stage annotation process. In the first stage, we labeled each meme as very harmful, partially harmful, or harmless; in the second stage, we further annotated the type of target(s) of each harmful meme: individual, organization, community, or society/general public/other. Evaluation results for ten unimodal and multimodal models highlight the importance of using multimodal signals for both tasks. We further discuss the limitations of these models and argue that more research is needed to address these problems.