Conventional saliency prediction models typically learn a deterministic mapping from an image to its saliency map, and thus fail to explain the subjective nature of human attention. In this paper, to model the uncertainty of visual saliency, we study the saliency prediction problem from the perspective of generative models by learning a conditional probability distribution over the saliency map given an input image, and treating saliency prediction as a sampling process from the learned distribution. Specifically, we propose a generative cooperative saliency prediction framework, where a conditional latent variable model (LVM) and a conditional energy-based model (EBM) are jointly trained to predict salient objects in a cooperative manner. The LVM serves as a fast but coarse predictor that efficiently produces an initial saliency map, which is then refined by the iterative Langevin revision of the EBM, which serves as a slow but fine predictor. This coarse-to-fine cooperative strategy combines the speed of the LVM with the accuracy of the EBM's refinement. Moreover, we propose a "cooperative learning while recovering" strategy and apply it to weakly supervised saliency prediction, where the saliency annotations of training images are only partially observed. Lastly, we find that the learned energy function in the EBM can serve as a refinement module for the results of other pre-trained saliency prediction models. Experimental results show that our model can produce a set of diverse and plausible saliency maps for an image, and obtains state-of-the-art performance in both fully supervised and weakly supervised saliency prediction tasks.
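As a rough illustration of the coarse-to-fine idea described above, the following minimal Python sketch draws an initial map from a stand-in coarse predictor and then applies Langevin updates of the form s ← s − (δ²/2) ∇E(s | x) + δ ε, with ε drawn from a standard normal distribution. Everything in it is an assumption made for illustration: the function names (`coarse_predict`, `energy_grad`, `langevin_refine`, `toy_energy`), the quadratic toy energy, the step size, and the `noise_scale` parameter are hypothetical and are not the networks or settings used in the paper.

```python
import random


def coarse_predict(image, rng):
    """Hypothetical stand-in for the LVM: one fast pass with a sampled latent."""
    z = rng.gauss(0.0, 1.0)  # a different latent z gives a different initial map
    return [0.5 * pixel + 0.1 * z for pixel in image]


def toy_energy(saliency, image):
    """Toy energy E(s | x) = 0.5 * sum((s_i - x_i)^2); an assumption, not the paper's EBM."""
    return 0.5 * sum((s - x) ** 2 for s, x in zip(saliency, image))


def energy_grad(saliency, image):
    """Gradient of the toy energy with respect to the saliency values."""
    return [s - x for s, x in zip(saliency, image)]


def langevin_refine(saliency, image, rng, steps=50, step_size=0.1, noise_scale=1.0):
    """Apply `steps` Langevin updates starting from the coarse map.

    Setting noise_scale=0.0 turns the update into plain gradient descent
    on the energy, which is convenient for deterministic checks.
    """
    s = list(saliency)
    for _ in range(steps):
        grad = energy_grad(s, image)
        s = [
            si - 0.5 * step_size ** 2 * gi
            + noise_scale * step_size * rng.gauss(0.0, 1.0)
            for si, gi in zip(s, grad)
        ]
    return s


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.0, 1.0, 0.2, 0.8]           # a flattened four-pixel "image"
    initial = coarse_predict(image, rng)    # fast but coarse
    refined = langevin_refine(initial, image, rng)  # slow but fine
    print("initial energy:", toy_energy(initial, image))
    print("refined energy:", toy_energy(refined, image))
```

Because both the latent variable and the Langevin noise are random, repeated calls with different seeds yield different saliency maps for the same input, which is the sense in which prediction becomes sampling from a learned conditional distribution.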