It is Okay to Not Be Okay: Overcoming Emotional Bias in Affective Image Captioning by Contrastive Data Collection

Datasets that capture the connection between vision, language, and affection are limited, causing a lack of understanding of the emotional aspect of human intelligence. As a step in this direction, the ArtEmis dataset was recently introduced as a large-scale dataset of emotional reactions to images along with language explanations of these chosen emotions. We observed a significant emotional bias towards instance-rich emotions, making trained neural speakers less accurate in describing under-represented emotions. We show that collecting new data in the same way is not effective in mitigating this emotional bias. To remedy this problem, we propose a contrastive data collection approach to balance ArtEmis with a new complementary dataset such that a pair of similar images have contrasting emotions (one positive and one negative). We collected 260,533 instances using the proposed method and combined them with ArtEmis, creating a second iteration of the dataset. The new combined dataset, dubbed ArtEmis v2.0, has a balanced distribution of emotions, with explanations revealing more fine details in the associated painting. Our experiments show that neural speakers trained on the new dataset improve CIDEr and METEOR evaluation metrics by 20% and 7%, respectively, compared to the biased dataset. Finally, we also show that the per-emotion performance of neural speakers improves across all emotion categories, and significantly so on under-represented emotions. The collected dataset and code are available at https://artemisdataset-v2.org.
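
The contrastive pairing idea described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature vectors, the cosine-similarity criterion, the emotion grouping, and the function names (`cosine`, `polarity`, `contrastive_requests`) are all assumptions introduced here for illustration. The sketch takes images that already carry a positive or negative emotion label, finds the most similar unlabeled candidate image, and emits a request to collect an annotation of the opposite polarity for that candidate.

```python
import math

# Assumed grouping of emotion labels into polarities (illustrative only;
# the abstract does not list the label set).
POSITIVE = {"amusement", "awe", "contentment", "excitement"}
NEGATIVE = {"anger", "disgust", "fear", "sadness"}


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def polarity(emotion):
    """Map an emotion label to 'positive', 'negative', or None if unknown."""
    if emotion in POSITIVE:
        return "positive"
    if emotion in NEGATIVE:
        return "negative"
    return None


def contrastive_requests(annotated, candidates):
    """Build (source_id, similar_id, target_polarity) collection requests.

    annotated:  {image_id: (feature_vector, emotion_label)}
    candidates: {image_id: feature_vector} for images to be annotated next
    """
    requests = []
    for image_id, (features, emotion) in annotated.items():
        source_polarity = polarity(emotion)
        if source_polarity is None:
            continue  # skip labels outside the assumed grouping
        pool = [c for c in candidates if c != image_id]
        if not pool:
            continue
        # Hypothetical similarity criterion: nearest neighbour by cosine.
        nearest = max(pool, key=lambda c: cosine(features, candidates[c]))
        target = "negative" if source_polarity == "positive" else "positive"
        requests.append((image_id, nearest, target))
    return requests


annotated = {"a": ([1.0, 0.0], "awe"), "b": ([0.0, 1.0], "fear")}
candidates = {"c": [0.9, 0.1], "d": [0.1, 0.9]}
requests = contrastive_requests(annotated, candidates)
```

In this toy input, image `a` (labeled with a positive emotion) is paired with its nearest candidate `c`, for which a negative-emotion annotation would be requested, and `b` is paired with `d` in the opposite direction. Collecting annotations this way pushes the combined label distribution toward balance, which is the effect the abstract attributes to the contrastive approach.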
