Memes can sway people's opinions on social media, as they combine visual and textual information in an easy-to-consume manner. Since memes can go viral in an instant, it is crucial to infer their intent and potential harmfulness in order to take timely measures as needed. A key challenge in meme comprehension is detecting the entities a meme references and characterizing the role of each of these entities. Here, we aim to understand whether a meme glorifies, vilifies, or victimizes each entity it refers to. To this end, we address the task of role identification of entities in harmful memes, i.e., detecting who is the 'hero', the 'villain', and the 'victim' in the meme, if any. We utilize HVVMemes, a dataset of memes about US politics and COVID-19, recently released as part of the CONSTRAINT@ACL-2022 shared task. It contains memes, the entities they reference, and the associated roles: hero, villain, victim, and other. We further design VECTOR (Visual-semantic role dEteCToR), a robust multi-modal framework for the task, which integrates entity-based contextual information into the multi-modal representation, and we compare it against several standard unimodal (text-only or image-only) and multi-modal (image+text) models. Our experimental results show that our proposed model achieves an improvement of 4% over the best baseline and of 1% over the best competing stand-alone submission to the shared task. Besides presenting an extensive experimental setup with comparative analyses, we finally highlight the challenges encountered in addressing the complex task of semantic role labeling within memes.
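For intuition only, below is a minimal sketch of the kind of entity-aware multi-modal classifier the abstract describes: it fuses an image embedding, a text embedding, and an embedding of the referenced entity, then predicts one of the four roles. The module name, feature dimensions, and simple late-fusion strategy are illustrative assumptions, not the actual VECTOR architecture from the paper.

```python
# A minimal sketch of entity-aware multi-modal role classification, assuming
# pre-extracted image, text, and entity features (e.g., from a pretrained
# vision-language encoder). EntityRoleClassifier and all dimensions are
# hypothetical; the paper's VECTOR model is not reproduced here.
import torch
import torch.nn as nn

ROLES = ["hero", "villain", "victim", "other"]

class EntityRoleClassifier(nn.Module):
    def __init__(self, img_dim=512, txt_dim=512, ent_dim=512, hidden=256):
        super().__init__()
        # Project each modality into a shared space before fusion.
        self.img_proj = nn.Linear(img_dim, hidden)
        self.txt_proj = nn.Linear(txt_dim, hidden)
        self.ent_proj = nn.Linear(ent_dim, hidden)
        # Simple late fusion: concatenate projected features, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(3 * hidden, hidden),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden, len(ROLES)),
        )

    def forward(self, img_feat, txt_feat, ent_feat):
        fused = torch.cat(
            [self.img_proj(img_feat),
             self.txt_proj(txt_feat),
             self.ent_proj(ent_feat)],
            dim=-1,
        )
        return self.classifier(fused)  # logits over hero/villain/victim/other

# Usage: one (meme, entity) pair per row; a batch of 2 with random features.
model = EntityRoleClassifier()
img = torch.randn(2, 512)  # image embedding of the meme
txt = torch.randn(2, 512)  # embedding of the meme's overlaid/OCR text
ent = torch.randn(2, 512)  # embedding of the referenced entity string
logits = model(img, txt, ent)
print(logits.shape)  # torch.Size([2, 4])
```

Framing each (meme, entity) pair as one classification instance mirrors the task setup: a single meme can reference several entities, each of which receives its own role label.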