Semantic segmentation is an important and prevalent task, but it severely suffers from the high cost of pixel-level annotations when extended to more classes in wider applications. To this end, we focus on the problem of weak-shot semantic segmentation, in which novel classes are learned from cheaper image-level labels with the support of base classes that have off-the-shelf pixel-level labels. To tackle this problem, we propose SimFormer, which performs dual similarity transfer upon MaskFormer. Specifically, MaskFormer disentangles the semantic segmentation task into two sub-tasks, proposal classification and proposal segmentation, performed for each proposal. Proposal segmentation allows proposal-pixel similarity transfer from base classes to novel classes, which enables mask learning for novel classes. We also learn pixel-pixel similarity from base classes and distill such class-agnostic semantic similarity into the semantic masks of novel classes, which regularizes the segmentation model with pixel-level semantic relationships across images. In addition, we propose a complementary loss to facilitate the learning of novel classes. Comprehensive experiments on the challenging COCO-Stuff-10K and ADE20K datasets demonstrate the effectiveness of our method. Code is available at https://github.com/bcmi/SimFormer-Weak-Shot-Semantic-Segmentation.
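To make the two forms of similarity transfer concrete, the sketch below illustrates them in PyTorch under assumed tensor shapes: proposal-pixel similarity produces a mask logit map per proposal by comparing each proposal embedding against per-pixel features, and pixel-pixel similarity distillation aligns the student's pixel-level similarity matrix with class-agnostic similarity targets learned on base classes. Function names, shapes, and the placeholder teacher targets are illustrative assumptions, not the authors' implementation; refer to the linked repository for the actual code.

```python
import torch
import torch.nn.functional as F

def proposal_pixel_masks(proposal_embed, pixel_feat):
    """Proposal-pixel similarity: compare each proposal embedding with every
    pixel feature to obtain one soft mask (logit map) per proposal.
    proposal_embed: (N, C) proposal embeddings from the transformer decoder
    pixel_feat:     (C, H, W) per-pixel features from the pixel decoder
    returns:        (N, H, W) mask logits
    """
    return torch.einsum("nc,chw->nhw", proposal_embed, pixel_feat)

def pixel_pixel_distill_loss(pixel_feat, teacher_sim):
    """Distill class-agnostic pixel-pixel semantic similarity (assumed to come
    from a similarity head trained on base classes) into the pixel features.
    pixel_feat:  (C, H, W) student pixel features
    teacher_sim: (P, P) similarity targets for P sampled pixel locations
    """
    C, H, W = pixel_feat.shape
    feat = pixel_feat.flatten(1).t()                    # (H*W, C)
    idx = torch.randperm(H * W)[:teacher_sim.size(0)]   # sample P pixels
    feat = F.normalize(feat[idx], dim=1)
    student_sim = feat @ feat.t()                       # (P, P) cosine similarity
    return F.mse_loss(student_sim, teacher_sim)

# Toy usage with random tensors (shapes are assumptions for illustration).
proposals = torch.randn(100, 256)                 # 100 proposal embeddings
pixels = torch.randn(256, 64, 64)                 # pixel decoder features
masks = proposal_pixel_masks(proposals, pixels)   # (100, 64, 64) mask logits
teacher = torch.rand(512, 512)                    # placeholder similarity targets
loss = pixel_pixel_distill_loss(pixels, teacher)
```

Because the mask branch scores proposal-pixel similarity rather than predicting a fixed set of class channels, masks learned on base classes can transfer to proposals classified as novel classes, which is what makes the weak-shot setting tractable with only image-level labels for the novel classes.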