Emotion stimulus detection is the task of finding the cause of an emotion in a textual description, similar to target or aspect detection in sentiment analysis. Previous work has approached this task in three ways: (1) as text classification into an inventory of predefined possible stimuli ("Is the stimulus category A or B?"), (2) as sequence labeling of tokens ("Which tokens describe the stimulus?"), and (3) as clause classification ("Does this clause contain the emotion stimulus?"). So far, setting (3) has been evaluated broadly on Mandarin and setting (2) on English, but the two have not been compared directly. We therefore aim to answer whether clause classification or sequence labeling is better suited for emotion stimulus detection in English. To that end, we propose an integrated framework that lets us evaluate the two approaches on equal terms, implement models inspired by state-of-the-art approaches for Mandarin, and test them on four English datasets from different domains. Our results show that sequence labeling is superior on three out of four datasets, in both clause-based and sequence-based evaluation. The only case in which clause classification performs better is one dataset with a high density of clause annotations. Our error analysis further confirms, quantitatively and qualitatively, that clauses are not the appropriate stimulus unit in English.
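To make the idea of evaluating both formulations on equal terms concrete, the following is a minimal sketch of how token-level and clause-level stimulus annotations could be projected onto each other and scored with the same metric. It is an illustration under stated assumptions, not the framework used in this work: the function names, the rule that a clause counts as positive when it contains at least one stimulus token, the hand-written clause boundaries, and the toy sentence are all hypothetical.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # clause boundaries as [start, end) token offsets


def tokens_to_clauses(token_labels: List[int], clause_spans: List[Span]) -> List[int]:
    """Assumed projection: a clause is positive if any of its tokens is a stimulus token."""
    return [int(any(token_labels[start:end])) for start, end in clause_spans]


def clauses_to_tokens(clause_labels: List[int], clause_spans: List[Span], n_tokens: int) -> List[int]:
    """Assumed projection: every token inside a positive clause is marked as stimulus."""
    labels = [0] * n_tokens
    for label, (start, end) in zip(clause_labels, clause_spans):
        if label:
            for i in range(start, end):
                labels[i] = 1
    return labels


def f1(gold: List[int], pred: List[int]) -> float:
    """Binary F1 over aligned label lists (tokens or clauses)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy example (hypothetical): the stimulus is "my sister visited me".
tokens = "I was happy because my sister visited me yesterday".split()
clause_spans = [(0, 3), (3, 9)]               # hand-written clause segmentation
gold_tokens = [0, 0, 0, 0, 1, 1, 1, 1, 0]     # token-level gold annotation
gold_clauses = tokens_to_clauses(gold_tokens, clause_spans)

# A clause classifier that picks exactly the right clause ...
pred_clauses = [0, 1]
pred_tokens = clauses_to_tokens(pred_clauses, clause_spans, len(tokens))

print(f1(gold_clauses, pred_clauses))  # clause-based score: 1.0
print(f1(gold_tokens, pred_tokens))    # token-based score: 0.8
```

In this toy case the clause prediction is perfect at the clause level but loses precision at the token level, because the clause also covers "because" and "yesterday". This is only meant to illustrate why a clause can be a coarser unit than the annotated stimulus span; it says nothing about the reported results.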