Deep neural networks often suffer from distribution shift between training and testing data, and batch statistics are observed to reflect this shift. In this paper, aiming to alleviate distribution shift at test time, we revisit batch normalization (BN) in the training process and reveal two key insights that benefit test-time optimization: $(i)$ preserving the same gradient backpropagation form as in training, and $(ii)$ using dataset-level statistics for robust optimization and inference. Based on these two insights, we propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing the entropy loss. We verify the effectiveness of our method on two typical settings with distribution shift, i.e., domain generalization and robustness tasks. Our GpreBN significantly improves test-time performance and achieves state-of-the-art results.
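To make the two insights concrete, below is a minimal PyTorch sketch of a gradient-preserving test-time BN layer. The class name `GPreBN2d`, the stop-gradient re-standardization trick, and all hyperparameters are our illustrative assumptions, not the paper's exact implementation: the forward output equals normalization with dataset-level running statistics, while `detach()` lets gradients flow through the current batch statistics, preserving the training-time backpropagation form.

```python
import torch
import torch.nn as nn


class GPreBN2d(nn.Module):
    """Hypothetical sketch of a gradient-preserving test-time BN layer.

    Forward value: normalize with dataset-level (running) statistics.
    Backward pass: gradients flow through the current batch statistics,
    mimicking training-mode BN.
    """

    def __init__(self, bn: nn.BatchNorm2d):
        super().__init__()
        self.eps = bn.eps
        # Dataset-level statistics collected during training.
        self.register_buffer("mu_d", bn.running_mean.clone())
        self.register_buffer("var_d", bn.running_var.clone())
        # Affine parameters are the only ones updated at test time.
        self.weight = nn.Parameter(bn.weight.detach().clone())
        self.bias = nn.Parameter(bn.bias.detach().clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Current batch statistics; gradients flow through these.
        mu_b = x.mean(dim=(0, 2, 3), keepdim=True)
        var_b = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
        sigma_b = (var_b + self.eps).sqrt()
        x_hat = (x - mu_b) / sigma_b
        # Re-standardize with dataset-level statistics. The detach()
        # (stop-gradient) makes the forward value identical to normalizing
        # with dataset stats alone, while keeping the training BN backward form.
        mu_d = self.mu_d.view(1, -1, 1, 1)
        sigma_d = (self.var_d + self.eps).sqrt().view(1, -1, 1, 1)
        x_hat = (x_hat * sigma_b.detach() + mu_b.detach() - mu_d) / sigma_d
        return self.weight.view(1, -1, 1, 1) * x_hat + self.bias.view(1, -1, 1, 1)
```

A test-time adaptation loop in the spirit of the abstract would then minimize the prediction entropy, updating only the BN affine parameters; `model` and `test_loader` below are assumed to be given:

```python
def entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    # Mean Shannon entropy of the softmax predictions over the batch.
    return -(logits.softmax(dim=1) * logits.log_softmax(dim=1)).sum(dim=1).mean()


params = [p for m in model.modules() if isinstance(m, GPreBN2d)
          for p in m.parameters()]
opt = torch.optim.SGD(params, lr=1e-3)  # learning rate is an assumption

for x in test_loader:
    loss = entropy_loss(model(x))
    opt.zero_grad()
    loss.backward()
    opt.step()
```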