This research introduces a novel approach to textual and multimodal Hate Speech Detection (HSD), using Large Language Models (LLMs) as dynamic knowledge bases to generate background context and incorporate it into the input of HSD classifiers. Two context generation strategies are examined: one focused on named entities and the other on full-text prompting. Four methods of incorporating context into the classifier input are compared: text concatenation, embedding concatenation, hierarchical transformer-based fusion, and LLM-driven text enhancement. Experiments are conducted on the textual Latent Hatred dataset of implicit hate speech and extended to a multimodal setting on the MAMI dataset of misogynous memes. Results suggest that both the contextual information and the method by which it is incorporated are key, with gains of up to 3 and 6 F1 points on the textual and multimodal setups, respectively, over a zero-context baseline; the highest-performing system is based on embedding concatenation.
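To make the best-performing strategy concrete, the sketch below shows one plausible form of embedding concatenation: a shared encoder embeds the post and the LLM-generated context separately, and a classifier head operates on the concatenated embeddings. This is a minimal illustration under assumed choices (a BERT-style encoder, [CLS] pooling, the class and variable names are hypothetical), not the paper's actual implementation.

```python
# Illustrative sketch of embedding concatenation for HSD (hypothetical names;
# not the authors' code). Assumes an LLM-generated context string is already
# available for each input post.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class EmbeddingConcatClassifier(nn.Module):
    def __init__(self, encoder_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # The classifier sees post and context embeddings side by side.
        self.classifier = nn.Linear(2 * hidden, num_labels)

    def embed(self, batch):
        # Use the [CLS] token representation as a sentence embedding.
        return self.encoder(**batch).last_hidden_state[:, 0]

    def forward(self, post_batch, context_batch):
        post_emb = self.embed(post_batch)                # (B, H)
        ctx_emb = self.embed(context_batch)              # (B, H)
        fused = torch.cat([post_emb, ctx_emb], dim=-1)   # (B, 2H)
        return self.classifier(fused)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = EmbeddingConcatClassifier()
post = tokenizer(["example post"], return_tensors="pt", padding=True)
ctx = tokenizer(["LLM-generated background context"], return_tensors="pt", padding=True)
logits = model(post, ctx)  # shape: (1, num_labels)
```

Unlike text concatenation, this design keeps the post and its generated context in separate encoder passes, so the context cannot crowd the post out of the encoder's input window.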