Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with image-level labels is challenging and important. Recently, end-to-end WSSS methods have become the focus of research due to their high training efficiency. However, current methods extract insufficient comprehensive semantic information, resulting in low-quality pseudo-labels and sub-optimal solutions for end-to-end WSSS. To this end, we propose a simple and novel Self Correspondence Distillation (SCD) method to refine pseudo-labels without introducing external supervision. Our SCD enables the network to utilize feature correspondence derived from itself as a distillation target, which enhances the network's feature learning by complementing semantic information. In addition, to further improve segmentation accuracy, we design a Variation-aware Refine Module that enhances the local consistency of pseudo-labels by computing pixel-level variation. Finally, we present an efficient end-to-end Transformer-based framework (TSCD) built on SCD and the Variation-aware Refine Module for accurate WSSS. Extensive experiments on the PASCAL VOC 2012 and MS COCO 2014 datasets demonstrate that our method significantly outperforms other state-of-the-art methods. Our code is available at https://github.com/Rongtao-Xu/RepresentationLearning/tree/main/SCD-AAAI2023.
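To make the distillation target concrete, below is a minimal PyTorch sketch of a self-correspondence loss. It assumes, since the abstract does not specify, that feature correspondence is measured as pairwise cosine similarity between spatial positions, that features from a shallower and a deeper stage of the same backbone are paired, and that MSE is used as the distillation distance; the function names `correspondence` and `scd_loss` are hypothetical, not the authors' API.

```python
import torch
import torch.nn.functional as F

def correspondence(feats: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine-similarity (feature correspondence) matrix.

    feats: (B, C, H, W) feature map from the backbone.
    Returns: (B, HW, HW) affinity between all spatial positions.
    """
    f = feats.flatten(2).transpose(1, 2)   # (B, HW, C)
    f = F.normalize(f, dim=-1)             # unit-norm each position
    return f @ f.transpose(1, 2)           # (B, HW, HW)

def scd_loss(shallow: torch.Tensor, deep: torch.Tensor) -> torch.Tensor:
    """Self-correspondence distillation: align the correspondence of one
    feature set with that of another taken from the SAME network, so no
    external teacher is needed. Which layers are paired (here, a shallow
    vs. a deep stage) is an assumption for illustration.
    """
    if shallow.shape[-2:] != deep.shape[-2:]:
        # Resize so both correspondence matrices share a spatial grid.
        shallow = F.interpolate(shallow, size=deep.shape[-2:],
                                mode="bilinear", align_corners=False)
    with torch.no_grad():                  # target side receives no gradient
        target = correspondence(deep)
    return F.mse_loss(correspondence(shallow), target)
```

In training, such a loss would simply be added to the usual classification/segmentation objectives, weighted by a hyperparameter; since the target is produced by the network itself, the scheme fits the single-stage, end-to-end setting the abstract describes.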