Emotion detection is an established NLP task of demonstrated utility for text understanding. However, basic emotion detection leaves out key information, namely, who is experiencing the emotion in question. For example, it may be the author, the narrator, or a character; or the emotion may correspond to something the audience is supposed to feel, or even be unattributable to a specific being, e.g., when emotions are being discussed per se. We provide the ABBE corpus -- Animate Beings Being Emotional -- a new double-annotated corpus of texts that captures this key information for one class of emotion experiencer, namely, animate beings in the world described by the text. Such a corpus is useful for developing systems that seek to model or understand this specific type of expressed emotion. Our corpus contains 30 chapters, comprising 134,513 words, drawn from the Corpus of English Novels, and contains 2,010 unique emotion expressions attributable to 2,227 animate beings. The emotion expressions are categorized according to Plutchik's 8-category emotion model, and the overall inter-annotator agreement for the annotations was 0.83 Cohen's Kappa, indicating excellent agreement. We describe in detail our annotation scheme and procedure, and also release the corpus for use by other researchers.
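As a rough illustration of the agreement statistic reported above, the following is a minimal sketch of Cohen's kappa for two annotators who each assign one Plutchik category per item. The function name `cohens_kappa`, the category spellings, and the toy labels are assumptions made for illustration; the abstract does not say how annotated expressions were aligned before agreement was computed, and the example uses no data from the corpus.

```python
from collections import Counter

# Plutchik's eight basic emotions (label spellings are illustrative assumptions).
PLUTCHIK = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label
    # frequencies, summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined here, and this sketch treats the case as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels, invented for illustration only.
annotator_1 = ["joy", "joy", "fear", "fear"]
annotator_2 = ["joy", "joy", "fear", "joy"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on 3 of 4 items (observed agreement 0.75) while chance agreement is 0.5, so kappa is 0.5. Kappa corrects raw agreement for the agreement expected by chance, which is why it is preferred over simple percent agreement when some emotion categories are much more frequent than others.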