Few-shot semantic segmentation aims to segment novel classes using only a few labelled examples. This challenging task requires mining the relevant relationships between the query image and the support images. Previous works have typically treated it as a pixel-wise classification problem, and various models have therefore been designed to explore the correlation of pixels between the query image and the support images. However, they focus only on pixel-wise correspondence and ignore the overall correlation of objects. In this paper, we introduce a mask-based classification method to address this problem. We propose the mask aggregation network (MANet), a simple mask classification model that simultaneously generates a fixed number of masks and their probabilities of being targets. The final segmentation result is then obtained by aggregating all the masks according to their locations. Experiments on both the PASCAL-5^i and COCO-20^i datasets show that our method performs comparably to state-of-the-art pixel-based methods. This competitive performance demonstrates the potential of mask classification as an alternative baseline method in few-shot semantic segmentation. Our source code will be made available at https://github.com/TinyAway/MANet.
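The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the MANet implementation: it assumes, purely for illustration, that each of the N candidate masks is a per-pixel foreground probability map and that the masks are combined by a sum weighted by their target probabilities, followed by a threshold. The function name `aggregate_masks`, the `threshold` parameter, and the weighted-sum rule itself are assumptions, not details taken from the paper.

```python
import numpy as np


def aggregate_masks(mask_probs, target_probs, threshold=0.5):
    """Combine N candidate masks into one binary segmentation (illustrative only).

    mask_probs:   (N, H, W) per-pixel foreground probabilities, one map per mask
    target_probs: (N,) probability that each mask belongs to the target class
    threshold:    hypothetical cut-off applied to the aggregated score
    """
    mask_probs = np.asarray(mask_probs, dtype=float)
    target_probs = np.asarray(target_probs, dtype=float)
    if mask_probs.ndim != 3 or target_probs.shape != (mask_probs.shape[0],):
        raise ValueError("expected mask_probs of shape (N, H, W) and target_probs of shape (N,)")

    # Assumed aggregation rule: weight every mask by its target probability
    # and sum over the N masks, giving one score per pixel location.
    score = np.einsum("n,nhw->hw", target_probs, mask_probs)
    score = np.clip(score, 0.0, 1.0)

    # A pixel is labelled as target when its aggregated score exceeds the threshold.
    return score, score > threshold


if __name__ == "__main__":
    # Two toy 2x2 masks: the first covers the top row, the second the bottom row.
    masks = [[[1, 1], [0, 0]],
             [[0, 0], [1, 1]]]
    probs = [0.9, 0.1]  # the first mask is very likely the target
    score, segmentation = aggregate_masks(masks, probs)
    print(score)
    print(segmentation)
```

In this toy case only the top row is labelled as target, because the mask covering it carries most of the probability mass. A mask-level formulation of this kind scores whole regions rather than individual pixels, which is the contrast with pixel-wise classification that the abstract draws.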