This study introduces a new normalization layer termed Batch Layer Normalization (BLN) to mitigate the problem of internal covariate shift in deep neural network layers. As a combined version of batch and layer normalization, BLN adaptively weights mini-batch and feature normalization based on the inverse size of the mini-batch to normalize the input to a layer during the learning process. At inference time it performs the same computation with a minor change, using either mini-batch statistics or population statistics. The decision to use either mini-batch or population statistics gives BLN the ability to play a comprehensive role in the hyper-parameter optimization process of models. A key advantage of BLN is that its theoretical analysis is independent of the input data, while its statistical configuration depends heavily on the task performed, the amount of training data, and the size of the batches. Test results indicate the application potential of BLN and its faster convergence than batch normalization and layer normalization in both Convolutional and Recurrent Neural Networks. The code of the experiments is publicly available online (https://github.com/A2Amir/Batch-Layer-Normalization).
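As a rough illustration of the idea described above, the following is a minimal NumPy sketch of a forward pass that blends mini-batch (per-feature) and feature (per-example) statistics with a weight derived from the inverse mini-batch size. The function name, the specific convex-combination weighting, and the tensor layout are illustrative assumptions, not the paper's exact formulation; see the linked repository for the actual implementation.

```python
import numpy as np

def batch_layer_norm(x, gamma, beta, eps=1e-5):
    """Illustrative sketch of a combined batch/layer normalization step.

    x:     mini-batch of shape (batch_size, num_features)
    gamma: learnable scale of shape (num_features,)
    beta:  learnable shift of shape (num_features,)
    """
    batch_size = x.shape[0]
    # Assumed weighting: the influence of per-example (layer) statistics
    # grows as the mini-batch gets smaller (weight = 1 / batch_size).
    w = 1.0 / batch_size

    # Mini-batch (batch-normalization-style) statistics: per feature, across examples
    batch_mean = x.mean(axis=0, keepdims=True)
    batch_var = x.var(axis=0, keepdims=True)

    # Feature (layer-normalization-style) statistics: per example, across features
    layer_mean = x.mean(axis=1, keepdims=True)
    layer_var = x.var(axis=1, keepdims=True)

    # Adaptive combination of the two sets of statistics (illustrative choice)
    mean = (1.0 - w) * batch_mean + w * layer_mean
    var = (1.0 - w) * batch_var + w * layer_var

    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Example usage on a small random mini-batch
x = np.random.randn(8, 16)
gamma = np.ones(16)
beta = np.zeros(16)
y = batch_layer_norm(x, gamma, beta)
print(y.shape)  # (8, 16)
```

At inference time, the abstract notes that either mini-batch statistics or stored population statistics can be used in place of the statistics computed above, which is what allows this choice to be treated as a hyper-parameter of the model.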