Despite years of research, out-of-domain generalization remains a critical weakness of deep networks for semantic segmentation. Previous studies relied on the assumption of a static model, i.e. once the training process is complete, model parameters remain fixed at test time. In this work, we challenge this premise with a self-adaptive approach for semantic segmentation that adjusts the inference process to each input sample. Self-adaptation operates on two levels. First, it employs a self-supervised loss that customizes the parameters of convolutional layers in the network to the input image. Second, in Batch Normalization layers, self-adaptation approximates the mean and the variance of the entire test data, which is assumed unavailable. It achieves this by interpolating between the training and the reference distribution derived from a single test sample. To empirically analyze our self-adaptive inference strategy, we develop and follow a rigorous evaluation protocol that addresses serious limitations of previous work. Our extensive analysis leads to a surprising conclusion: Using a standard training procedure, self-adaptation significantly outperforms strong baselines and sets new state-of-the-art accuracy on multi-domain benchmarks. Our study suggests that self-adaptive inference may complement the established practice of model regularization at training time for improving deep network generalization to out-of-domain data.
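To make the Batch Normalization part of self-adaptation concrete, the following is a minimal sketch of interpolating between training-time statistics and statistics estimated from a single test sample. It assumes a PyTorch-style setup; the class name InterpolatedBatchNorm2d and the mixing coefficient alpha are illustrative, not the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InterpolatedBatchNorm2d(nn.BatchNorm2d):
    """BatchNorm layer that blends stored training statistics with
    statistics computed from the current test sample (illustrative)."""

    def __init__(self, num_features, alpha=0.9, **kwargs):
        super().__init__(num_features, **kwargs)
        self.alpha = alpha  # weight on the training-time statistics

    def forward(self, x):
        # Per-channel statistics of the single test sample.
        sample_mean = x.mean(dim=(0, 2, 3)).detach()
        sample_var = x.var(dim=(0, 2, 3), unbiased=False).detach()

        # Interpolate between the training distribution (running stats)
        # and the reference distribution from the test sample.
        mean = self.alpha * self.running_mean + (1 - self.alpha) * sample_mean
        var = self.alpha * self.running_var + (1 - self.alpha) * sample_var

        # Normalize with the interpolated statistics; no stats are updated.
        return F.batch_norm(
            x, mean, var, self.weight, self.bias,
            training=False, eps=self.eps,
        )
```

In practice, such a layer would replace the standard BatchNorm modules of a trained segmentation network at inference time, so that each input image shifts the normalization statistics toward its own distribution while the convolutional parameters are tuned by the self-supervised loss.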