Batch Normalization (BN) has become an essential technique in contemporary neural network design, enhancing training stability. Specifically, BN employs centering and scaling operations to standardize features along the batch dimension and uses an affine transformation to recover feature expressiveness. Although standard BN improves the training and convergence of deep neural networks, it still exhibits inherent limitations in certain cases. Current enhancements to BN typically address only isolated aspects of its mechanism. In this work, we critically examine BN from a feature perspective and identify feature condensation during BN as a factor that degrades test performance. To tackle this problem, we propose a two-stage unified framework called Unified Batch Normalization (UBN). In the first stage, we employ a straightforward feature condensation threshold to mitigate condensation effects, thereby preventing improper updates of the normalization statistics. In the second stage, we unify various normalization variants to boost each component of BN. Our experimental results show that UBN significantly enhances performance across different visual backbones and vision tasks, and markedly expedites network training convergence, particularly in the early stages of training. Notably, our method improves accuracy on ImageNet classification by about 3% and mean average precision on both object detection and instance segmentation on the COCO dataset by about 4%, demonstrating the effectiveness of our approach in real-world scenarios.
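For reference, below is a minimal sketch (in PyTorch-style Python, which the abstract does not specify) of the standard BN forward pass described above: centering and scaling along the batch dimension followed by an affine transformation. The condensation check that gates the running-statistics update is a purely hypothetical illustration of the idea of a "feature condensation threshold"; the actual criterion and the unified second stage of UBN are defined in the paper itself.

```python
import torch


def bn_forward(x, gamma, beta, running_mean, running_var,
               momentum=0.1, eps=1e-5, condensation_threshold=None):
    """Sketch of a BatchNorm forward pass for inputs of shape (N, C).

    The condensation check is hypothetical: the abstract only states that a
    feature condensation threshold decides whether batch statistics should
    update the running statistics, not how condensation is measured.
    """
    # Centering and scaling along the batch dimension.
    batch_mean = x.mean(dim=0)
    batch_var = x.var(dim=0, unbiased=False)
    x_hat = (x - batch_mean) / torch.sqrt(batch_var + eps)

    # Hypothetical condensation proxy: mean pairwise cosine similarity of
    # the normalized features within the batch (high similarity = condensed).
    update_stats = True
    if condensation_threshold is not None:
        sims = torch.nn.functional.cosine_similarity(
            x_hat.unsqueeze(0), x_hat.unsqueeze(1), dim=-1)   # (N, N)
        n = x.shape[0]
        condensation = (sims.sum() - sims.diagonal().sum()) / (n * (n - 1))
        update_stats = bool(condensation < condensation_threshold)

    # Only update running statistics when features are not condensed.
    if update_stats:
        running_mean.mul_(1 - momentum).add_(momentum * batch_mean)
        running_var.mul_(1 - momentum).add_(momentum * batch_var)

    # Affine transformation to recover feature expressiveness.
    return gamma * x_hat + beta
```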