UPB at SemEval-2022 Task 5: Enhancing UNITER with Image Sentiment and Graph Convolutional Networks for Multimedia Automatic Misogyny Identification

The detection of hate speech and offensive or abusive language in online media has recently become an important topic in NLP research, driven by the exponential growth of social media, the propagation of such messages, and their impact. Misogyny detection, although an important part of hate-speech detection, has not received the same attention. In this paper, we describe the classification systems we submitted to SemEval-2022 Task 5: MAMI - Multimedia Automatic Misogyny Identification. The shared task aimed to identify misogynous content in a multi-modal setting by analysing meme images together with their textual captions. To this end, we propose two models based on the pre-trained UNITER model: the first is enhanced with an image sentiment classifier, while the second leverages a Vocabulary Graph Convolutional Network (VGCN). Additionally, we explore an ensemble of these two models. Our best model reaches an F1-score of 71.4% in Sub-task A and 67.3% in Sub-task B, positioning our team in the upper third of the leaderboard. We release the code and experiments for our models on GitHub.
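
The abstract does not say how the ensemble combines the two models or how the F1-score is averaged. As a rough illustration only, the sketch below assumes a soft-voting ensemble (averaging each model's predicted probability for the misogynous class) and a macro-averaged F1. The function names, the 0.5 threshold, and the choice of averaging are all hypothetical and are not taken from the authors' released code.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[float]], threshold: float = 0.5) -> List[int]:
    """Average per-model positive-class probabilities, then threshold.

    Assumption: each inner sequence holds one model's probability that
    a meme is misogynous, in the same item order for every model.
    """
    n_models = len(prob_sets)
    n_items = len(prob_sets[0])
    preds = []
    for i in range(n_items):
        mean_p = sum(model[i] for model in prob_sets) / n_models
        preds.append(1 if mean_p >= threshold else 0)
    return preds


def f1_for_label(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 for a single label, treating that label as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-label F1 scores (assumed averaging scheme)."""
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)
```

With made-up probabilities such as `[0.9, 0.2, 0.6, 0.4]` from one model and `[0.7, 0.4, 0.3, 0.8]` from the other, `soft_vote` returns one binary label per meme, which `macro_f1` can then score against gold labels.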
