The key challenge in few-shot semantic segmentation (FSS) is designing an effective interaction between support and query features and/or their prototypes under the episodic training scenario. Most existing FSS methods implement this support-query interaction using only plain operations, e.g., cosine similarity and feature concatenation, to segment the query objects. However, these interactions often fail to capture the intrinsic object details that are common in FSS query images; for example, when the query object contains holes or slots, the segmentation is almost always inaccurate. To address this, we propose a dynamic prototype convolution network (DPCN) that captures these intrinsic details for accurate FSS. Specifically, DPCN first introduces a dynamic convolution module (DCM) that generates dynamic kernels from the support foreground; information interaction is then achieved by convolving the query features with these kernels. Moreover, we equip DPCN with a support activation module (SAM) and a feature filtering module (FFM), which generate a pseudo mask and filter out background information for the query images, respectively. Together, SAM and FFM mine enriched context information from the query features. DPCN is also flexible and efficient under the k-shot FSS setting. Extensive experiments on PASCAL-5$^i$ and COCO-20$^i$ show that DPCN yields superior performance under both 1-shot and 5-shot settings.
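To make the contrast between a plain support-query interaction and a kernel-based one concrete, the following is a minimal NumPy sketch, not the DPCN implementation. It assumes features of shape (C, H, W) and a binary support mask of shape (H, W). The function names, the pooling-based kernel generator, the odd kernel size k, and the depthwise convolution are all illustrative assumptions; the abstract does not specify how DCM builds or applies its kernels.

```python
import numpy as np

def masked_average_prototype(support_feat, support_mask, eps=1e-8):
    """Average support features over foreground pixels -> vector of shape (C,)."""
    weighted = support_feat * support_mask[None]           # (C, H, W)
    return weighted.sum(axis=(1, 2)) / (support_mask.sum() + eps)

def cosine_similarity_map(query_feat, prototype, eps=1e-8):
    """Plain interaction: cosine similarity of every query pixel to the prototype."""
    dots = np.einsum("chw,c->hw", query_feat, prototype)
    norms = np.linalg.norm(query_feat, axis=0) * np.linalg.norm(prototype)
    return dots / (norms + eps)

def dynamic_kernels(support_feat, support_mask, k=3):
    """Hypothetical kernel generator: average-pool the masked support
    foreground onto a k x k grid, giving one kernel per channel: (C, k, k).
    Assumes H >= k and W >= k so that no grid cell is empty."""
    fg = support_feat * support_mask[None]
    c, h, w = fg.shape
    rows = np.array_split(np.arange(h), k)
    cols = np.array_split(np.arange(w), k)
    kernels = np.empty((c, k, k))
    for i, r in enumerate(rows):
        for j, cl in enumerate(cols):
            kernels[:, i, j] = fg[:, r][:, :, cl].mean(axis=(1, 2))
    return kernels

def depthwise_conv(query_feat, kernels):
    """Apply each channel's kernel to the matching query channel
    (cross-correlation, zero padding, same output size; assumes odd k)."""
    c, h, w = query_feat.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(query_feat, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(query_feat, dtype=float)
    for i in range(k):
        for j in range(k):
            out += kernels[:, i, j][:, None, None] * padded[:, i:i + h, j:j + w]
    return out

# Toy episode with random features (illustrative values only).
rng = np.random.default_rng(0)
support_feat = rng.standard_normal((4, 6, 6))
query_feat = rng.standard_normal((4, 6, 6))
support_mask = np.zeros((6, 6))
support_mask[1:5, 2:5] = 1.0                               # toy foreground region

prototype = masked_average_prototype(support_feat, support_mask)
sim = cosine_similarity_map(query_feat, prototype)         # plain interaction
kernels = dynamic_kernels(support_feat, support_mask, k=3)
interacted = depthwise_conv(query_feat, kernels)           # kernel-based interaction
print(sim.shape, kernels.shape, interacted.shape)          # (6, 6) (4, 3, 3) (4, 6, 6)
```

The prototype path collapses the support foreground into a single vector, so each query pixel is compared in isolation. The kernel path keeps a small spatial layout of the foreground, so each output pixel depends on a neighborhood of the query feature map. That neighborhood dependence is one plausible reading of why convolution-based interaction could help with fine structures such as holes and slots.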