The goal of the present work is to propose a way to modify both the initialization distribution of the weights of a neural network and its activation function, such that all pre-activations are Gaussian. We propose a family of initialization/activation pairs, in which the activation functions span a continuum from bounded functions (such as Heaviside or tanh) to the identity function. This work is motivated by a contradiction between existing works dealing with Gaussian pre-activations: on one side, works in the line of the Neural Tangent Kernel and the Edge of Chaos assume it; on the other side, theoretical and experimental results challenge this hypothesis. The family of initialization/activation pairs we propose will help us answer a question of current interest: is it desirable to have Gaussian pre-activations in a neural network?
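To make the Gaussian pre-activation hypothesis concrete, here is a minimal sketch (not the paper's construction) that empirically inspects the pre-activation distribution of a randomly initialized network. The tanh activation, the layer width and depth, and the variance-1/width Gaussian initialization are illustrative assumptions, not the setup proposed in this work.

```python
# Minimal sketch: how close are the pre-activations of a randomly
# initialized tanh MLP to a Gaussian? All hyperparameters below are
# illustrative assumptions, not the paper's proposed family.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
width, depth, n_samples = 512, 10, 2000

x = rng.standard_normal((n_samples, width))
for _ in range(depth):
    # Gaussian weights with variance 1/width (a common default choice)
    W = rng.standard_normal((width, width)) / np.sqrt(width)
    pre_act = x @ W          # pre-activations of the current layer
    x = np.tanh(pre_act)     # bounded activation

# Check one unit of the last layer's pre-activations: under the
# Gaussian hypothesis, excess kurtosis should be near 0 and the
# Kolmogorov-Smirnov test should not reject normality.
z = pre_act[:, 0]
z_std = (z - z.mean()) / z.std()
print("excess kurtosis:", stats.kurtosis(z))
print("KS p-value vs standard Gaussian:", stats.kstest(z_std, "norm").pvalue)
```

Running such a check at several depths is one way to probe, empirically, whether the Gaussian hypothesis questioned in this work holds as depth grows.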