Visual Emotion Analysis (VEA), which aims to predict people's emotions towards different visual stimuli, has recently become an attractive research topic. Because the emotion labels for an image are obtained by voting among different individuals, it is more reasonable to treat VEA as a Label Distribution Learning (LDL) problem than as a single-label classification task. Existing methods often predict the visual emotion distribution with a single unified network, neglecting the inherent subjectivity of the crowd voting process. In psychology, the \textit{Object-Appraisal-Emotion} model has shown that each individual's emotion is affected by his/her subjective appraisal, which is in turn formed by affective memory. Inspired by this, we propose a novel \textit{Subjectivity Appraise-and-Match Network (SAMNet)} to investigate the subjectivity in visual emotion distribution. To depict the diversity in the crowd voting process, we first propose \textit{Subjectivity Appraising} with multiple branches, where each branch simulates the emotion evocation process of a specific individual. Specifically, we construct the affective memory with an attention-based mechanism to preserve each individual's unique emotional experience. A subjectivity loss is further proposed to guarantee the divergence between different individuals. Moreover, we propose \textit{Subjectivity Matching} with a matching loss, which uses the Hungarian algorithm to assign unordered emotion labels to ordered individual predictions in a one-to-one correspondence. Extensive experiments and comparisons on public visual emotion distribution datasets demonstrate that the proposed SAMNet consistently outperforms state-of-the-art methods. An ablation study verifies the effectiveness of our method, and visualization demonstrates its interpretability.
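The one-to-one assignment behind Subjectivity Matching can be illustrated with a small sketch. It is not the SAMNet implementation: the function name `match_votes_to_branches`, the negative log-likelihood cost, and the toy numbers are all assumptions made for illustration, and the number of branches is assumed to equal the number of votes. For brevity the sketch finds the optimal assignment by exhaustive search over permutations rather than by the Hungarian algorithm; the two return the same minimum-cost matching, but a realistic number of voters would call for a Hungarian solver such as SciPy's `scipy.optimize.linear_sum_assignment`.

```python
import math
from itertools import permutations


def match_votes_to_branches(branch_probs, votes):
    """Assign each vote to exactly one branch at minimum total cost.

    branch_probs[i][c] is the probability that branch i (a simulated
    individual) gives to emotion class c; votes[j] is the class index
    of the j-th unordered vote. Returns (assignment, total_cost), where
    assignment[i] is the index of the vote matched to branch i.
    """
    n = len(branch_probs)
    if len(votes) != n:
        raise ValueError("this sketch assumes one vote per branch")

    eps = 1e-12  # avoids log(0) for zero-probability classes
    # cost[i][j]: negative log-likelihood of vote j under branch i
    # (an assumed cost; the abstract does not specify the matching loss).
    cost = [
        [-math.log(branch_probs[i][votes[j]] + eps) for j in range(n)]
        for i in range(n)
    ]

    # Exhaustive search stands in for the Hungarian algorithm here.
    best_perm, best_cost = None, math.inf
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_cost:
            best_perm, best_cost = perm, total
    return list(best_perm), best_cost


# Toy example: three branches, three emotion classes, three votes.
branch_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
]
votes = [1, 2, 0]  # unordered labels collected from three voters

assignment, total_cost = match_votes_to_branches(branch_probs, votes)
matched_labels = [votes[j] for j in assignment]
print(assignment, matched_labels, round(total_cost, 4))
```

In this toy case each branch ends up with the vote it rates most likely, so the matched labels follow the branch order even though the votes arrived unordered. The exhaustive search costs O(n!) and is only practical for a handful of branches.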