Most dense recognition approaches make a separate decision at each pixel. These approaches deliver competitive performance in the usual closed-set setups. However, important applications in the wild typically require strong performance in the presence of outliers. We show that this demanding setup greatly benefits from mask-level predictions, even in the case of non-finetuned baseline models. Moreover, we propose an alternative formulation of dense recognition uncertainty that effectively reduces false positive responses at semantic borders. The proposed formulation yields a further improvement over a very strong baseline and sets a new state of the art in outlier-aware semantic segmentation, both with and without training on negative data. Our contributions also improve performance in a recent panoptic setup. In-depth experiments confirm that our approach succeeds due to implicit aggregation of pixel-level cues into mask-level predictions.