Conventional saliency prediction models typically learn a deterministic mapping from an image to its ground-truth saliency map. In this paper, we study saliency prediction from the perspective of generative models: we learn a conditional probability distribution over saliency maps given an image and treat prediction as sampling from that distribution. Specifically, we propose a generative cooperative saliency prediction framework based on generative cooperative networks, in which a conditional latent variable model and a conditional energy-based model are jointly trained to predict saliency in a cooperative manner. We call our model SalCoopNets. The latent variable model serves as a fast but coarse predictor that efficiently produces an initial prediction, which is then refined by iterative Langevin revision under the energy-based model, which serves as a fine predictor. This coarse-to-fine cooperative strategy offers the best of both worlds: the efficiency of the coarse predictor and the precision of the fine one. Moreover, we generalize our framework to weakly supervised saliency prediction, where the saliency annotations of training images are only partially observed, by proposing a cooperative learning-while-recovering strategy. Lastly, we show that the learned energy function can serve as a refinement module for the outputs of other pre-trained saliency prediction models. Experimental results show that our generative model achieves state-of-the-art performance. Our code is publicly available at: \url{https://github.com/JingZhang617/SalCoopNets}.
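The coarse-to-fine idea described above can be illustrated with a minimal NumPy sketch. This is not the SalCoopNets implementation: the function names (`toy_target`, `coarse_predict`, `energy`, `energy_grad`, `langevin_refine`) are hypothetical, and a hand-written quadratic energy stands in for the learned neural energy function, purely so that the example is self-contained. It shows only the generic pattern of a cheap initial prediction followed by Langevin updates of the form s ← s − (δ²/2) ∇ₛU(s, x) + δ·ε.

```python
import numpy as np


def toy_target(image):
    """Toy 'ideal' saliency map: channel-mean intensity (illustrative assumption)."""
    return image.mean(axis=-1)


def coarse_predict(image, z):
    """Stand-in for the latent variable model: one cheap pass, perturbed by latent z."""
    return 0.5 * toy_target(image) + 0.1 * z


def energy(saliency, image):
    """Toy energy U(s, x) = 0.5 * ||s - target(x)||^2 (assumed, not a learned network)."""
    return 0.5 * np.sum((saliency - toy_target(image)) ** 2)


def energy_grad(saliency, image):
    """Analytic gradient of the toy energy with respect to the saliency map."""
    return saliency - toy_target(image)


def langevin_refine(saliency, image, steps, step_size, rng, add_noise=True):
    """Refine an initial map with Langevin updates on the energy.

    With add_noise=False the update reduces to plain gradient descent on U.
    """
    s = saliency.copy()
    for _ in range(steps):
        noise = rng.standard_normal(s.shape) if add_noise else 0.0
        s = s - 0.5 * step_size**2 * energy_grad(s, image) + step_size * noise
    return s


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))      # toy RGB image
z = rng.standard_normal((8, 8))    # latent sample for the coarse predictor

coarse = coarse_predict(image, z)
# Noise-free refinement so the energy decrease is deterministic in this demo.
refined = langevin_refine(coarse, image, steps=50, step_size=0.3, rng=rng,
                          add_noise=False)
```

In this toy setting, the refined map has lower energy than the coarse one. Setting `add_noise=True` turns the refinement into stochastic sampling, so repeated runs would yield different saliency maps for the same image.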