Visual attention is one of the most significant mechanisms for selecting and understanding information from the redundant outside world. Complex natural scenes contain enormous redundancy, and the human visual system cannot process all of this information simultaneously because of the visual information bottleneck. Instead, it focuses mainly on the dominant parts of a scene to reduce the redundancy of the visual input. Modeling this behavior is commonly known as visual attention prediction, or visual saliency map prediction. This paper proposes a new psychophysical saliency prediction architecture, WECSF, inspired by the function of the human low-level visual cortex. The model consists of opponent color channels, a wavelet transform, a wavelet energy map, and a contrast sensitivity function, which together extract low-level image features and approximate the human visual system as closely as possible. To assess its efficiency, the proposed model is evaluated on several datasets, including MIT1003, MIT300, TORONTO, SID4VAM, and UCF Sports. We also compare its saliency prediction performance, quantitatively and qualitatively, with that of other state-of-the-art models. First, our model achieved stable and good performance across these datasets. Second, we confirmed that Fourier and spectral-inspired saliency prediction models outperform other state-of-the-art non-neural-network models, and even deep neural network models, on psychophysical synthetic images. Finally, the proposed model can also be applied to spatio-temporal saliency prediction, where it achieved better performance.
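The pipeline named in the abstract (opponent color channels, wavelet transform, wavelet energy map, contrast sensitivity weighting) can be illustrated with a minimal NumPy sketch. This is not the authors' WECSF implementation: the opponent-channel formulas, the choice of a Haar wavelet, the Gaussian band weight standing in for a contrast sensitivity function, and all function and parameter names below are assumptions made only for illustration.

```python
import numpy as np


def opponent_channels(rgb):
    """Split an RGB image (H, W, 3), floats in [0, 1], into three channels.

    Assumed, simplified opponent space: intensity, red-green, blue-yellow.
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    red_green = r - g
    blue_yellow = b - (r + g) / 2.0
    return [intensity, red_green, blue_yellow]


def haar_level(x):
    """One level of a 2-D Haar transform: returns (approx, detail bands)."""
    h, w = x.shape
    x = x[: h - h % 2, : w - w % 2]  # crop to even size
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2.0
    horizontal = (a + b - c - d) / 2.0
    vertical = (a - b + c - d) / 2.0
    diagonal = (a - b - c + d) / 2.0
    return approx, (horizontal, vertical, diagonal)


def csf_weight(level, peak_level=2.0, width=1.0):
    """Hypothetical band-pass weight per wavelet level (a CSF stand-in)."""
    return float(np.exp(-((level - peak_level) ** 2) / (2.0 * width ** 2)))


def saliency_sketch(rgb, levels=3):
    """Sum CSF-weighted wavelet detail energy over levels and channels."""
    h, w = rgb.shape[:2]
    saliency = np.zeros((h, w))
    for channel in opponent_channels(rgb):
        approx = channel
        for level in range(1, levels + 1):
            if min(approx.shape) < 2:
                break  # nothing left to decompose
            approx, details = haar_level(approx)
            # Wavelet energy map: squared detail coefficients, summed over bands.
            energy = sum(band ** 2 for band in details)
            # Nearest-neighbour upsampling back towards the input resolution.
            factor = 2 ** level
            up = np.kron(energy, np.ones((factor, factor)))[:h, :w]
            saliency[: up.shape[0], : up.shape[1]] += csf_weight(level) * up
    peak = saliency.max()
    return saliency / peak if peak > 0 else saliency


if __name__ == "__main__":
    # Grey background with a small red patch as a toy stimulus.
    image = np.full((32, 32, 3), 0.5)
    image[13:19, 13:19] = [1.0, 0.0, 0.0]
    result = saliency_sketch(image)
    print(result.shape, float(result.min()), float(result.max()))
```

On a uniform background with a single colored patch, the detail coefficients vanish wherever the image is flat, so the map is nonzero only around the patch. A faithful reimplementation would need the wavelet family, the CSF parameters, and the fusion rule from the paper itself.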