We develop a uniform theoretical approach to the analysis of various neural network connectivity architectures by introducing the notion of a quiver neural network. Inspired by quiver representation theory in mathematics, this approach provides a compact way to capture elaborate data flows in complex network architectures. As an application, we use parameter-space symmetries to prove the correctness of a lossless model compression algorithm for quiver neural networks with certain non-pointwise activations known as rescaling activations. In the case of radial rescaling activations, we prove that training the compressed model with gradient descent is equivalent to training the original model with projected gradient descent.
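To make the key notion concrete, the following is a minimal sketch of a radial rescaling activation, assuming the standard form σ(v) = (h(‖v‖)/‖v‖)·v for a scalar function h. The helper name `radial_rescale` and the step-ReLU choice of h are illustrative, not taken from the paper.

```python
import numpy as np

def radial_rescale(v, h, eps=1e-12):
    """Radial rescaling activation (illustrative helper): scales the
    whole vector v by h(||v||) / ||v||, so the nonlinearity acts on
    the norm of v rather than pointwise on its entries."""
    r = np.linalg.norm(v)
    if r < eps:           # avoid division by zero at the origin
        return np.zeros_like(v)
    return (h(r) / r) * v

# Example: a "step-ReLU" rescaling that zeroes out short vectors
step_relu = lambda r: r if r >= 1.0 else 0.0

v = np.array([0.6, 0.8])             # norm 1.0, kept unchanged
print(radial_rescale(v, step_relu))  # -> [0.6 0.8]

w = np.array([0.3, 0.4])             # norm 0.5 < 1, mapped to zero
print(radial_rescale(w, step_relu))  # -> [0. 0.]
```

Unlike a pointwise ReLU, this map multiplies the entire vector by a single factor determined only by its norm, which is what makes it non-pointwise in the sense used above.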