Visual sentiment analysis has received increasing attention in recent years. However, dataset quality is a concern: the sentiment labels are crowd-sourced, subjective, and prone to mistakes, which poses a severe threat to data-driven models, especially deep neural networks. Deep models generalize poorly on test cases when they are trained to overfit training samples with noisy sentiment labels. Inspired by recent progress on learning with noisy labels, we propose a robust learning method for visual sentiment analysis. Our method relies on an external memory to aggregate and filter noisy labels during training. The memory is composed of prototypes with corresponding labels, both of which can be updated online. The learned prototypes and their labels can be regarded as denoised features and labels for local regions, and they guide the training process to prevent the model from overfitting noisy cases. We establish a benchmark for visual sentiment analysis with label noise using publicly available datasets. Experimental results under the proposed benchmark settings comprehensively demonstrate the effectiveness of our method.
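The prototype memory described above can be sketched minimally as follows. This is an illustrative assumption, not the paper's exact formulation: the `PrototypeMemory` class, the L2-normalized features, the nearest-prototype assignment, and the `momentum`-based exponential moving average are all hypothetical design choices introduced here to show how prototypes and their soft labels could be updated online and then used as denoised training targets.

```python
import numpy as np

class PrototypeMemory:
    """Hedged sketch: an external memory of prototypes with soft labels,
    updated online, whose aggregated labels serve as denoised targets.
    All hyperparameters and update rules here are illustrative assumptions."""

    def __init__(self, num_prototypes, dim, num_classes, momentum=0.9):
        rng = np.random.default_rng(0)
        # Randomly initialized, L2-normalized prototype vectors.
        self.protos = rng.normal(size=(num_prototypes, dim))
        self.protos /= np.linalg.norm(self.protos, axis=1, keepdims=True)
        # Each prototype carries a soft label, initialized to uniform.
        self.labels = np.full((num_prototypes, num_classes), 1.0 / num_classes)
        self.momentum = momentum

    def update(self, feat, onehot):
        # Assign the feature to its nearest prototype (cosine similarity),
        # then move both the prototype and its soft label toward the new
        # observation with an exponential moving average.
        k = int(np.argmax(self.protos @ feat))
        m = self.momentum
        self.protos[k] = m * self.protos[k] + (1 - m) * feat
        self.protos[k] /= np.linalg.norm(self.protos[k])
        self.labels[k] = m * self.labels[k] + (1 - m) * onehot
        return k

    def denoised_label(self, feat):
        # The aggregated soft label of the nearest prototype acts as a
        # denoised training target; averaging over many (possibly noisy)
        # observed labels filters out isolated labeling mistakes.
        return self.labels[int(np.argmax(self.protos @ feat))]
```

In this sketch, a mostly correct but occasionally mislabeled stream of examples pulls each prototype's soft label toward the true class, so a model supervised with `denoised_label` outputs is less exposed to individual noisy annotations.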