Few-shot segmentation (FSS) is a challenging dense prediction task that entails segmenting a novel query image given only a small annotated support set. The key problem is thus to design a method that aggregates detailed information from the support set while being robust to large variations in appearance and context. To this end, we propose a few-shot segmentation method based on dense Gaussian process (GP) regression. Given the support set, our dense GP learns the mapping from local deep image features to mask values and is capable of capturing complex appearance distributions. Furthermore, it provides a principled means of capturing uncertainty, which serves as another powerful cue for the final segmentation, obtained by a CNN decoder. Instead of using a one-dimensional mask output, we further exploit the end-to-end learning capabilities of our approach to learn a high-dimensional output space for the GP. Our approach sets a new state of the art for both 1-shot and 5-shot FSS on the PASCAL-5$^i$ and COCO-20$^i$ benchmarks, achieving an absolute gain of $+14.9$ mIoU in the COCO-20$^i$ 5-shot setting. Furthermore, the segmentation quality of our approach scales gracefully as the support set size increases, and it achieves robust cross-dataset transfer.
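
To make the core idea concrete, the following is a minimal sketch of GP regression from support features to mask values, returning a posterior mean and a per-location variance for the query. It is not the authors' implementation: the squared-exponential kernel, the hyperparameters `length_scale` and `noise_var`, the function names, and the random toy features are all assumptions chosen for illustration, and the learned feature extractor, learned output space, and CNN decoder are omitted.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a (N, D) and b (M, D)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clamp tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(support_feats, support_targets, query_feats,
                 length_scale=1.0, noise_var=0.1):
    """GP regression posterior at the query locations.

    support_feats:   (N, D) local features from the support images
    support_targets: (N, E) mask values (E = 1 for a scalar mask)
    query_feats:     (M, D) local features from the query image
    Returns the posterior mean (M, E) and the latent variance (M,).
    """
    k_ss = rbf_kernel(support_feats, support_feats, length_scale)
    k_qs = rbf_kernel(query_feats, support_feats, length_scale)

    # Cholesky factor of the noisy support covariance.
    chol = np.linalg.cholesky(k_ss + noise_var * np.eye(len(support_feats)))

    # alpha = (K_ss + noise_var * I)^{-1} y, via two solves with the factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, support_targets))
    mean = k_qs @ alpha

    # Posterior variance; the prior variance k(x, x) is 1 for this kernel.
    v = np.linalg.solve(chol, k_qs.T)  # shape (N, M)
    var = 1.0 - np.sum(v**2, axis=0)
    return mean, var


# Toy usage with random stand-ins for deep features (assumed shapes only).
rng = np.random.default_rng(0)
support_feats = rng.normal(size=(50, 8))           # 50 support locations
support_targets = (support_feats[:, :1] > 0) * 1.0  # synthetic binary mask
query_feats = rng.normal(size=(20, 8))              # 20 query locations

mean, var = gp_posterior(support_feats, support_targets, query_feats)
print(mean.shape, var.shape)
```

In a pipeline like the one the abstract describes, both outputs would be passed on to the decoder: the mean as the predicted mask encoding and the variance as the uncertainty cue. Query locations whose features are far from every support feature get a variance close to the prior value, which signals that the support set says little about them.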