Autoencoding, which reconstructs input images through a bottleneck latent representation, is one of the classic strategies for feature representation learning. It has been shown to be effective as an auxiliary task for semi-supervised learning, but it has become less popular as more sophisticated methods have been proposed in recent years. In this paper, we revisit the idea of using image reconstruction as an auxiliary task and incorporate it into a modern semi-supervised semantic segmentation framework. Surprisingly, we find that this old idea from semi-supervised learning can produce results competitive with state-of-the-art semantic segmentation algorithms. By visualizing the intermediate-layer activations of the image reconstruction module, we show that individual feature map channels can correlate well with semantic concepts, which explains why joint training with the reconstruction task helps the segmentation task. Motivated by this observation, we further propose a modification to the image reconstruction task that aims to disentangle object cues from background patterns. Experimental evaluation shows that using reconstruction as an auxiliary loss leads to consistent improvements across various datasets and methods. The proposed modification leads to further significant improvements on object-centric segmentation tasks.
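The joint training described above can be pictured as a segmentation loss on labeled pixels plus a weighted reconstruction loss on every image. The following is only a minimal sketch under stated assumptions: the abstract does not specify the loss functions or their weighting, so the softmax cross-entropy, the mean squared error, and the names `joint_loss` and `recon_weight` are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one pixel; logits is a list of class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mse(recon, target):
    """Mean squared error between two equal-length lists of pixel values."""
    return sum((r - t) ** 2 for r, t in zip(recon, target)) / len(target)


def joint_loss(seg_logits, seg_labels, recon, image, recon_weight=1.0):
    """Segmentation loss on labeled pixels plus a weighted reconstruction loss.

    seg_labels may contain None for unlabeled pixels (an assumed convention);
    those pixels contribute only through the reconstruction term.
    """
    labeled = [(z, y) for z, y in zip(seg_logits, seg_labels) if y is not None]
    if labeled:
        seg = sum(cross_entropy(z, y) for z, y in labeled) / len(labeled)
    else:
        seg = 0.0  # fully unlabeled image: reconstruction is the only signal
    return seg + recon_weight * mse(recon, image)
```

In this sketch an unlabeled image still produces a training signal through the reconstruction term, which is the sense in which reconstruction acts as an auxiliary task in the semi-supervised setting.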