Semantic segmentation with sparse annotation has advanced in recent years. In this setting, only part of each object in an image is labeled, and the remainder is left unlabeled. Most existing approaches are time-consuming and often require a multi-stage training strategy. In this work, we propose a simple yet effective framework for sparsely annotated semantic segmentation based on SegFormer, dubbed SASFormer. Specifically, the framework first generates hierarchical patch attention maps, which are then multiplied by the network predictions to produce correlated regions separated by valid labels. In addition, we introduce an affinity loss to ensure consistency between the features of the correlation results and the network predictions. Extensive experiments show that the proposed approach outperforms existing methods and achieves state-of-the-art performance. The source code is available at \url{https://github.com/su-hui-zz/SASFormer}.
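
The two mechanisms named above, multiplying patch attention maps by network predictions and penalizing disagreement between the result and the predictions, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-scale attention matrix, the function names `propagate_predictions` and `affinity_loss`, and the L1 form of the loss are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def propagate_predictions(attention, probs):
    """Multiply a patch attention map by per-patch class probabilities.

    attention: (N, N) row-stochastic matrix over N patches (assumed shape).
    probs:     (N, C) class probabilities predicted for each patch.
    Returns an (N, C) array of attention-weighted ("correlated") predictions.
    """
    return attention @ probs


def affinity_loss(attention, probs):
    """Assumed L1 consistency term between correlated and raw predictions."""
    correlated = propagate_predictions(attention, probs)
    return float(np.abs(correlated - probs).mean())


# Toy inputs: 16 patches, 3 classes, random attention and predictions.
rng = np.random.default_rng(0)
n_patches, n_classes = 16, 3
attention = softmax(rng.normal(size=(n_patches, n_patches)), axis=-1)
probs = softmax(rng.normal(size=(n_patches, n_classes)), axis=-1)

correlated = propagate_predictions(attention, probs)
loss = affinity_loss(attention, probs)
```

In a hierarchical setting, one such term would presumably be computed per encoder stage and combined with a cross-entropy loss over the labeled pixels only; how the terms are weighted is left open here.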