When humans judge the affective content of texts, they also implicitly assess the correctness of that judgment, that is, their confidence. We hypothesize that annotators' (lack of) confidence that they performed an annotation task well leads to (dis)agreement with one another. If this is true, confidence may serve as a diagnostic tool for systematic differences in annotations. To test this assumption, we conduct a study on a subset of the Corpus of Contemporary American English, in which we ask raters to distinguish neutral sentences from emotion-bearing ones while scoring the confidence of their answers. Confidence turns out to approximate inter-annotator disagreement. Further, we find that confidence is correlated with emotion intensity: perceiving stronger affect in a text prompts annotators to classify it with greater certainty. This insight is relevant for modelling studies of intensity, as it raises the question of whether automatic regressors and classifiers actually predict intensity, or rather humans' self-perceived confidence.
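As an illustration of the kind of analysis described above, the sketch below relates per-sentence rater confidence to inter-annotator disagreement. It is a minimal sketch, not the authors' code: the data, the 1-5 confidence scale, and the majority-based disagreement measure are illustrative assumptions.

```python
# Minimal sketch (hypothetical data): correlate mean rater confidence with
# a simple majority-based disagreement score per sentence.
from collections import Counter
from scipy.stats import spearmanr

# Hypothetical annotations: sentence id -> list of (label, confidence) pairs,
# where label is 0 (neutral) or 1 (emotion-bearing), confidence is 1-5.
annotations = {
    "s1": [(1, 5), (1, 4), (1, 5)],
    "s2": [(0, 2), (1, 3), (0, 2)],
    "s3": [(1, 3), (0, 2), (1, 2)],
    "s4": [(0, 4), (0, 5), (0, 4)],
    "s5": [(1, 2), (0, 3), (1, 3)],
}

mean_confidence, disagreement = [], []
for ratings in annotations.values():
    labels = [label for label, _ in ratings]
    confidences = [conf for _, conf in ratings]
    majority_count = Counter(labels).most_common(1)[0][1]
    # Disagreement = share of raters outside the majority class.
    disagreement.append(1 - majority_count / len(labels))
    mean_confidence.append(sum(confidences) / len(confidences))

rho, p_value = spearmanr(mean_confidence, disagreement)
print(f"Spearman correlation (confidence vs. disagreement): {rho:.2f}, p = {p_value:.3f}")
```

Under the hypothesis stated in the abstract, such a correlation would be negative: sentences rated with lower confidence would show higher disagreement.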