Semantic segmentation labels are expensive and time-consuming to acquire. Hence, pretraining is commonly used to improve the label-efficiency of segmentation models. Typically, the encoder of a segmentation model is pretrained as a classifier and the decoder is randomly initialized. Here, we argue that random initialization of the decoder can be suboptimal, especially when few labeled examples are available. We propose a decoder pretraining approach based on denoising, which can be combined with supervised pretraining of the encoder. We find that decoder denoising pretraining on the ImageNet dataset strongly outperforms encoder-only supervised pretraining. Despite its simplicity, decoder denoising pretraining achieves state-of-the-art results on label-efficient semantic segmentation and offers considerable gains on the Cityscapes, Pascal Context, and ADE20K datasets.
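To make the denoising objective concrete, the following is a minimal sketch of what decoder denoising pretraining could look like: Gaussian noise is added to the input, and the model is trained to predict the added noise. The model here is a toy linear map standing in for the full encoder-decoder, and the function names and noise level `sigma` are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an encoder-decoder segmentation model.
# In practice this would be a pretrained encoder plus the decoder
# being pretrained; a single weight matrix keeps the sketch self-contained.
W = np.eye(4) * 0.9

def model(x):
    # Hypothetical forward pass of the encoder-decoder.
    return x @ W

def decoder_denoising_loss(x, sigma=0.2):
    """Corrupt the input with Gaussian noise of scale `sigma` and
    score the model on predicting that noise (MSE objective)."""
    noise = rng.standard_normal(x.shape)
    x_noisy = x + sigma * noise
    pred = model(x_noisy)  # model is trained to output the added noise
    return float(np.mean((pred - noise) ** 2))

x = rng.standard_normal((8, 4))  # a batch of toy "images"
loss = decoder_denoising_loss(x)
```

Minimizing this loss by gradient descent on the decoder's parameters (with the encoder fixed or jointly trained) yields the pretrained decoder weights used to initialize segmentation fine-tuning.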