In this paper, we analyze how the batch normalization (BN) operation affects the back-propagation of the first and second derivatives of the loss. Taking the Taylor series expansion of the loss function, we prove that the BN operation blocks the influence of the first-order term and most of the influence of the second-order term. We also find that this problem is caused by the standardization phase of the BN operation. Experimental results verify our theoretical conclusions and show that the BN operation significantly affects feature representations in specific tasks where the losses of different samples share similar analytic formulas.
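As a rough illustration of why the standardization phase can block gradient signal, the following minimal sketch uses the standard backward formula for batch standardization, y = (x - mean) / std. It is not the paper's proof. It assumes `eps = 0`, omits the learnable scale and shift, and uses hypothetical names (`standardize`, `standardize_backward`, `g_blocked`, `g_generic`). It shows that any upstream gradient of the form `a + b * y`, with coefficients shared across the samples in the batch, is mapped to zero, whereas a generic upstream gradient is not.

```python
import numpy as np


def standardize(x, eps=0.0):
    """Standardize each feature (column) over the batch (rows)."""
    mu = x.mean(axis=0)
    sigma = np.sqrt(x.var(axis=0) + eps)
    return (x - mu) / sigma, sigma


def standardize_backward(g, y, sigma):
    """Map the gradient w.r.t. y to the gradient w.r.t. x (exact for eps = 0).

    The result has the batch-mean of g removed, and also the component
    of g that lies along y.
    """
    return (g - g.mean(axis=0) - y * (g * y).mean(axis=0)) / sigma


rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3))  # 8 samples, 3 features (illustrative sizes)
y, sigma = standardize(x)

# Upstream gradient a + b * y, e.g. from a per-sample loss a*y + 0.5*b*y**2
# whose coefficients a and b are the same for every sample (an assumption).
a = rng.normal(size=(1, 3))
b = rng.normal(size=(1, 3))
g_blocked = a + b * y
grad_blocked = standardize_backward(g_blocked, y, sigma)

# A generic upstream gradient that differs arbitrarily across samples.
g_generic = rng.normal(size=(8, 3))
grad_generic = standardize_backward(g_generic, y, sigma)

print("max |grad| for a + b*y :", np.abs(grad_blocked).max())
print("max |grad| for generic :", np.abs(grad_generic).max())
```

In this sketch the first gradient vanishes up to floating-point error because the standardized features have zero mean and unit variance over the batch, so the constant part and the part proportional to `y` are both removed in the backward pass.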