Unsupervised learning of superpixel segmentation with convolutional neural networks (CNNs) has recently been studied. Such methods generate superpixels by applying a CNN to a single image, and the network is trained without any labels or further information. The approach therefore relies on priors, typically incorporated by designing an objective function that guides the solution towards a meaningful superpixel segmentation. In this paper we propose three key elements to improve the efficacy of such networks: (i) the similarity of the \emph{soft} superpixelated image to the input image, (ii) the enhancement and consideration of object edges and boundaries, and (iii) a modified architecture based on atrous convolution, which allows for a wider field of view and functions as a multi-scale component in our network. Experiments on the BSDS500 dataset provide both qualitative and quantitative evidence for the significance of our proposal.
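To illustrate the wider field of view that atrous convolution provides in element (iii), the following is a minimal sketch in plain Python. It is not the architecture proposed in this paper: the function names `effective_kernel_size` and `atrous_conv1d`, the 1D setting, and the difference kernel are assumptions made only for illustration. The sketch shows that a kernel with the same number of weights spans a larger region as the dilation rate grows.

```python
def effective_kernel_size(kernel_size, dilation):
    """Extent (in samples) spanned by a kernel whose taps are `dilation` apart."""
    return dilation * (kernel_size - 1) + 1


def atrous_conv1d(signal, kernel, dilation=1):
    """'Valid' 1D atrous (dilated) correlation, written out explicitly.

    Hypothetical helper for illustration; deep-learning libraries expose the
    same idea through a dilation parameter on their convolution layers.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        # Each tap j reads the input `dilation` samples after the previous tap.
        out.append(sum(w * signal[start + j * dilation]
                       for j, w in enumerate(kernel)))
    return out


signal = [0, 1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, -1]  # three weights, regardless of the dilation rate

for rate in (1, 2, 4):
    print(rate, effective_kernel_size(len(kernel), rate),
          atrous_conv1d(signal, kernel, rate))
```

With three weights, dilation rates 1, 2 and 4 cover 3, 5 and 9 input samples respectively, so running such filters at several rates gives a multi-scale view without adding parameters per filter.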