Emotion intensity prediction determines the degree or intensity of an emotion that the author expresses in a text, extending previous categorical approaches to emotion detection. While most previous work on this topic has concentrated on English texts, other languages would also benefit from fine-grained emotion classification, preferably without having to recreate in each new language the amount of annotated data available in English. Consequently, we explore cross-lingual transfer approaches for fine-grained emotion detection in Spanish and Catalan tweets. To this end, we annotate a test set of Spanish and Catalan tweets using Best-Worst Scaling. We compare six cross-lingual approaches, e.g., machine translation and cross-lingual embeddings, which have varying requirements for parallel data -- from millions of parallel sentences to completely unsupervised. The results show that on this data, methods with low parallel-data requirements perform better, surprisingly, than methods that use more parallel data, which we explain through an in-depth error analysis. We make the dataset and the code available at \url{https://github.com/jerbarnes/fine-grained_cross-lingual_emotion}.
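The abstract does not say how Best-Worst Scaling judgments are turned into intensity scores. As a minimal sketch, assuming the commonly used counting procedure (the share of times an item is chosen as best minus the share of times it is chosen as worst), the conversion could look as follows. The function name `bws_scores`, the tuple layout, and the toy tweet labels are hypothetical and not taken from the released code.

```python
from collections import Counter


def bws_scores(annotations):
    """Return item -> score in [-1, 1] from Best-Worst Scaling judgments.

    Each annotation is (items, best, worst): the items shown together,
    the item judged most intense, and the item judged least intense.
    Assumption: simple counting, (#best - #worst) / #appearances.
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        seen.update(items)  # every item shown counts as one appearance
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Hypothetical toy judgments over four tweets labelled t1..t4.
annotations = [
    (("t1", "t2", "t3", "t4"), "t1", "t4"),
    (("t1", "t2", "t3", "t4"), "t1", "t3"),
    (("t1", "t2", "t3", "t4"), "t2", "t4"),
]
scores = bws_scores(annotations)
```

Scores near 1 mark tweets that annotators consistently judged the most intense, and scores near -1 mark the least intense; a linear rescaling to [0, 1] is a common follow-up step if a bounded intensity value is needed.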