Dropout is one of the most popular regularization techniques in neural network training. Owing to its effectiveness and conceptual simplicity, dropout has been analyzed extensively and many variants have been proposed. In this paper, we discussed several properties of dropout in a unified manner from the viewpoint of information geometry. We showed that dropout flattens the model manifold and that its regularization performance depends on the amount of curvature. We then showed that dropout essentially corresponds to a regularization that depends on the Fisher information, and supported this result with numerical experiments. Such theoretical analyses of the technique from different perspectives are expected to greatly assist the understanding of neural networks, whose theoretical foundations are still in their infancy.
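As a concrete illustration of the claimed correspondence, the following minimal sketch checks, for the special case of linear regression with input dropout, that the expected dropout loss equals the plain squared loss plus a ridge-like penalty weighted by the diagonal of the empirical second-moment matrix, which (up to the noise variance) is the Fisher information of the Gaussian linear model. This is a well-known identity for this simple setting, not the paper's general derivation; all variable names and the keep probability below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data and weights (illustrative, not from the paper).
n, d = 200, 5
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
w = rng.normal(size=d)
q = 0.8  # keep probability; dropout rate is 1 - q

def dropout_sq_loss_mc(X, y, w, q, n_samples=5000):
    """Monte Carlo estimate of the expected squared loss under
    input dropout with inverted scaling (kept inputs divided by q)."""
    total = 0.0
    for _ in range(n_samples):
        mask = rng.random(X.shape) < q
        total += np.mean((y - (X * mask / q) @ w) ** 2)
    return total / n_samples

# Exact identity for linear regression with input dropout:
#   E[dropout loss] = plain loss + ((1 - q) / q) * sum_i w_i^2 * mean_n(x_{ni}^2),
# where mean_n(x_{ni}^2) is the diagonal of X^T X / n, i.e. proportional
# to the diagonal of the Fisher information of the Gaussian linear model.
plain = np.mean((y - X @ w) ** 2)
fisher_diag = np.mean(X ** 2, axis=0)
penalty = ((1 - q) / q) * np.sum(w ** 2 * fisher_diag)

print("Monte Carlo dropout loss:", dropout_sq_loss_mc(X, y, w, q))
print("loss + Fisher penalty   :", plain + penalty)
```

The two printed values agree up to Monte Carlo error, showing in this simple case how the implicit regularizer induced by dropout is governed by the Fisher information rather than being an isotropic weight penalty.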