Subjectivity and differences of opinion are key social phenomena, and it is crucial to take them into account when annotating and detecting derogatory textual content. In this paper, we use the four datasets provided by SemEval-2023 Task 11 and fine-tune a BERT model to capture disagreement in the annotations. We find that modeling individual annotators and aggregating their predictions lowers the cross-entropy score by an average of 0.21 compared with training directly on the soft labels. Our findings further demonstrate that annotator metadata contributes an average reduction of 0.029 in the cross-entropy score.
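To make the comparison concrete, the following is a minimal sketch of how a soft label and its cross-entropy score could be computed for a single item, assuming binary labels. The function names, the annotator votes, and all predicted probabilities are hypothetical values chosen for illustration; they are not taken from the paper's data or results, and the aggregation shown (a plain average of per-annotator probabilities) is only one possible choice.

```python
import math


def soft_label(votes):
    """Fraction of annotators who assigned label 1 (votes are 0 or 1)."""
    return sum(votes) / len(votes)


def aggregate(per_annotator_probs):
    """Average per-annotator predicted probabilities into one soft prediction.

    Assumption: simple averaging; the paper's aggregation may differ.
    """
    return sum(per_annotator_probs) / len(per_annotator_probs)


def cross_entropy(target_p1, predicted_p1, eps=1e-12):
    """Binary cross-entropy between a soft target and a predicted probability."""
    q = min(max(predicted_p1, eps), 1 - eps)  # clip to avoid log(0)
    return -(target_p1 * math.log(q) + (1 - target_p1) * math.log(1 - q))


# Hypothetical item: five annotators, two of whom marked it as derogatory.
votes = [1, 1, 0, 0, 0]
target = soft_label(votes)

# Hypothetical output of a model trained directly on the soft labels.
direct_prediction = 0.7

# Hypothetical outputs of per-annotator models, one probability per annotator.
per_annotator = [0.9, 0.8, 0.2, 0.1, 0.1]
aggregated_prediction = aggregate(per_annotator)

ce_direct = cross_entropy(target, direct_prediction)
ce_aggregated = cross_entropy(target, aggregated_prediction)
print(f"direct: {ce_direct:.3f}, aggregated: {ce_aggregated:.3f}")
```

In this toy setting, lower cross-entropy means the predicted distribution is closer to the distribution of annotator votes, which is the sense in which the reductions reported above should be read.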