Sparsely annotated semantic segmentation (SASS) aims to train a segmentation network with coarse-grained (i.e., point-, scribble-, and block-wise) supervision, where only a small proportion of pixels are labeled in each image. In this paper, we propose a novel tree energy loss for SASS that provides semantic guidance for unlabeled pixels. The tree energy loss represents images as minimum spanning trees to model both low-level and high-level pairwise affinities. By sequentially applying these affinities to the network prediction, soft pseudo labels for unlabeled pixels are generated in a coarse-to-fine manner, achieving dynamic online self-training. The tree energy loss is effective and easy to incorporate into existing frameworks: it is simply combined with a traditional segmentation loss. Compared with previous SASS methods, ours requires no multistage training strategies, alternating optimization procedures, additional supervised data, or time-consuming post-processing, yet outperforms them in all SASS settings. Code is available at https://github.com/megviiresearch/TEL.
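To make the coarse-to-fine pseudo-labeling concrete, below is a minimal numpy/scipy sketch of the idea as described in the abstract: build a minimum spanning tree over a 4-connected pixel grid, filter the network's soft predictions along the tree (low-level tree from image colors, then high-level tree from features), and penalize the L1 gap between predictions and the filtered pseudo labels on unlabeled pixels. All names (`build_mst`, `tree_filter`, `tree_energy_loss`) and hyperparameters (`sigma`, the grid connectivity) are our own illustrative choices, not the released API; the official TEL implementation is a differentiable GPU module, whereas this is a forward-only CPU sketch.

```python
# Illustrative sketch of a tree-energy-style loss; not the authors' code.
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import minimum_spanning_tree, breadth_first_order

def build_mst(feat):
    """Minimum spanning tree over a 4-connected pixel grid.
    feat: (H, W, C) per-pixel features (RGB for the low-level tree,
    backbone features for the high-level tree)."""
    H, W, _ = feat.shape
    flat = feat.reshape(H * W, -1)
    idx = np.arange(H * W).reshape(H, W)
    rows, cols, wts = [], [], []
    # horizontal and vertical neighbors; edge weight = feature distance
    for u, v in [(idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])]:
        rows.append(u.ravel()); cols.append(v.ravel())
        d = flat[u.ravel()] - flat[v.ravel()]
        wts.append(np.linalg.norm(d, axis=1) + 1e-8)  # avoid zero weights
    g = coo_matrix((np.concatenate(wts),
                    (np.concatenate(rows), np.concatenate(cols))),
                   shape=(H * W, H * W))
    return minimum_spanning_tree(g)

def tree_filter(x, mst, sigma):
    """Filter per-pixel scores x (N, K) along the tree: each output pixel is a
    normalized sum over all pixels weighted by exp(-path_distance / sigma),
    computed in O(N) with a leaf-to-root then root-to-leaf pass."""
    n = x.shape[0]
    sym = mst + mst.T
    order, parent = breadth_first_order(sym, i_start=0, directed=False)
    pw = np.zeros(n)                       # similarity to parent along tree edge
    for i in order[1:]:
        pw[i] = np.exp(-sym[parent[i], i] / sigma)
    up, up_n = x.copy(), np.ones(n)        # subtree (leaf-to-root) aggregation
    for i in order[::-1]:
        p = parent[i]
        if p >= 0:
            up[p] += pw[i] * up[i]
            up_n[p] += pw[i] * up_n[i]
    f, f_n = up.copy(), up_n.copy()        # full-tree (root-to-leaf) aggregation
    for i in order[1:]:
        p = parent[i]
        f[i] = up[i] + pw[i] * (f[p] - pw[i] * up[i])
        f_n[i] = up_n[i] + pw[i] * (f_n[p] - pw[i] * up_n[i])
    return f / f_n[:, None]

def tree_energy_loss(prob, image, feat, labeled_mask,
                     sigma_low=0.02, sigma_high=0.1):
    """Coarse-to-fine soft pseudo labels (low-level tree, then high-level
    tree); L1 distance to the prediction on unlabeled pixels only."""
    pseudo = tree_filter(prob, build_mst(image), sigma_low)
    pseudo = tree_filter(pseudo, build_mst(feat), sigma_high)
    unlabeled = ~labeled_mask.ravel()
    return np.abs(prob - pseudo)[unlabeled].mean()
```

In training, this term would be added to a partial cross-entropy loss computed on the sparsely labeled pixels, so the total objective is the traditional segmentation loss plus a weighted tree energy term; the weighting and the exact affinity kernels here are assumptions for illustration.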