AnnoBERT: Effectively Representing Multiple Annotators' Label Choices to Improve Hate Speech Detection

Supervised approaches generally rely on majority-based labels. However, it is hard to achieve high agreement among annotators in subjective tasks such as hate speech detection. Existing neural network models principally treat labels as categorical variables and ignore the semantic information in the diverse label texts. In this paper, we propose AnnoBERT, a first-of-its-kind architecture that integrates annotator characteristics and label text with a transformer-based model to detect hate speech. It builds a unique representation of each annotator's characteristics via Collaborative Topic Regression (CTR) and integrates label text to enrich the textual representations. During training, the model associates annotators with their label choices given a piece of text; during evaluation, when label information is not available, the model uses the learnt association to predict the aggregated label given by the participating annotators. The proposed approach shows an advantage in detecting hate speech, especially in the minority class and in edge cases with annotator disagreement. The improvement in overall performance is largest when the dataset is more label-imbalanced, which suggests practical value for identifying real-world hate speech, since the volume of hate speech in the wild on social media is extremely small compared with normal (non-hate) speech. Through ablation studies, we show the relative contributions of annotator embeddings and label text to model performance, and we test a range of alternative annotator embeddings and label text combinations.
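The train/evaluate asymmetry described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairing scheme (one example per annotator and candidate label text, with a binary "did this annotator choose it" target), the label texts in `LABEL_TEXTS`, the function names, and the majority-vote aggregation with ties going to the lowest label id are all assumptions made for illustration. The `score_fn` argument stands in for a trained model that scores a (text, annotator, label text) triple.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

# Hypothetical label texts; the abstract does not specify the wording used.
LABEL_TEXTS: Dict[int, str] = {0: "not hateful", 1: "hateful"}

Example = Tuple[str, str, str, int]  # (text, annotator_id, label_text, target)


def build_training_examples(text: str, annotations: Dict[str, int]) -> List[Example]:
    """Training: pair each annotator with every candidate label text.

    The target is 1 when the annotator actually chose that label, else 0.
    """
    examples: List[Example] = []
    for annotator_id, chosen_label in annotations.items():
        for label_id, label_text in LABEL_TEXTS.items():
            examples.append((text, annotator_id, label_text, int(label_id == chosen_label)))
    return examples


def predict_aggregated_label(
    text: str,
    annotator_ids: List[str],
    score_fn: Callable[[str, str, str], float],
) -> int:
    """Evaluation: no labels are available, so score every label text per annotator.

    Each annotator's predicted choice is the highest-scoring label; the
    aggregated label is the majority vote (ties go to the lowest label id,
    an arbitrary choice made for this sketch).
    """
    votes = Counter()
    for annotator_id in annotator_ids:
        best = max(LABEL_TEXTS, key=lambda lid: score_fn(text, annotator_id, LABEL_TEXTS[lid]))
        votes[best] += 1
    return max(sorted(votes), key=lambda lid: votes[lid])


# Toy usage: a stand-in scorer that simply remembers each annotator's choice.
toy_annotations = {"a1": 1, "a2": 0, "a3": 1}


def toy_score(text: str, annotator_id: str, label_text: str) -> float:
    return 1.0 if label_text == LABEL_TEXTS[toy_annotations[annotator_id]] else 0.0


train = build_training_examples("some post", toy_annotations)
pred = predict_aggregated_label("some post", list(toy_annotations), toy_score)
```

In a real system `score_fn` would be a transformer that takes the text and label text as input together with an annotator embedding; the toy scorer here only demonstrates the data flow.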
