Despite the great progress deep neural networks have made on semantic segmentation, traditional neural-network-based methods typically require large amounts of pixel-level annotations. Recent progress in few-shot semantic segmentation tackles this issue using only a few pixel-level annotated examples. However, these few-shot approaches cannot easily be extended to multi-way or weak-annotation settings. In this paper, we advance the few-shot segmentation paradigm toward a scenario where image-level annotations are available to complement training with a few pixel-level annotations. Our key idea is to learn a better prototype representation of each class by fusing knowledge from the image-level labeled data. Specifically, we propose a new framework, called PAIA, that learns class prototype representations in a metric space by integrating image-level annotations. Furthermore, to account for the uncertainty of pseudo-masks, we design a distilled soft masked average pooling strategy to handle distractions in image-level annotations. Extensive empirical results on two datasets show the superior performance of PAIA.
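The abstract does not detail PAIA's distilled soft masked average pooling, but the underlying idea of weighting features by a soft (probabilistic) pseudo-mask when forming a class prototype can be sketched as follows. This is a generic illustration, not the paper's implementation; the tensor shapes and the `eps` stabilizer are assumptions.

```python
import numpy as np

def soft_masked_average_pooling(features, soft_mask, eps=1e-8):
    """Form a class prototype by weighting per-pixel features with a soft
    foreground mask, so uncertain pseudo-mask pixels contribute less.

    features : (H, W, C) feature map from a backbone network
    soft_mask: (H, W) per-pixel foreground probabilities in [0, 1]
    returns  : (C,) prototype vector
    """
    weighted = features * soft_mask[..., None]                # (H, W, C)
    return weighted.sum(axis=(0, 1)) / (soft_mask.sum() + eps)

# toy example: uniform features under an uncertain (0.5-probability) mask
feats = np.ones((4, 4, 8))
mask = np.full((4, 4), 0.5)
proto = soft_masked_average_pooling(feats, mask)   # close to a vector of ones
```

With a hard binary mask this reduces to the standard masked average pooling used in prototype-based few-shot segmentation; the soft weights are what let pseudo-mask uncertainty be folded into the prototype.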