We propose a new type of neural network, the Kronecker neural network (KNN), which forms a general framework for neural networks with adaptive activation functions. KNNs employ the Kronecker product, which provides an efficient way of constructing a very wide network while keeping the number of parameters low. Our theoretical analysis reveals that, under suitable conditions, KNNs induce a faster decay of the loss than the corresponding feed-forward networks, a result we also verify empirically through a set of computational examples. Furthermore, under certain technical assumptions, we establish global convergence of gradient descent for KNNs. As a specific case, we propose the Rowdy activation function, which is designed to eliminate saturation regions by injecting sinusoidal fluctuations with trainable parameters. The Rowdy activation function can be employed in any neural network architecture, such as feed-forward, recurrent, and convolutional neural networks. We demonstrate the effectiveness of KNNs with Rowdy activations through various computational experiments, including function approximation with feed-forward neural networks, solution inference for partial differential equations with physics-informed neural networks, and standard deep learning benchmarks with convolutional and fully-connected neural networks.
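To make the parameter-efficiency claim concrete, the sketch below illustrates the generic Kronecker-product idea (not the paper's exact layer construction): two small trainable factors combine into a much larger effective weight matrix, so the wide linear map is parameterized by far fewer trainable entries than a dense matrix of the same size. The factor shapes are arbitrary choices for illustration.

```python
import torch

# Two small trainable factors.
A = torch.randn(4, 4, requires_grad=True)   # 16 parameters
B = torch.randn(8, 8, requires_grad=True)   # 64 parameters

# Kronecker product yields a 32 x 32 effective weight (1024 entries),
# while only 16 + 64 = 80 entries are actually trainable.
W = torch.kron(A, B)

x = torch.randn(32)
y = W @ x  # a "wide" linear map; gradients flow back into A and B
```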
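A minimal sketch of an activation in the Rowdy spirit is given below, assuming a fixed base activation augmented by trainable sinusoidal terms. The class name RowdyActivation, the num_terms and freq parameters, and the exact parameterization are illustrative assumptions; the paper defines the precise form.

```python
import torch
import torch.nn as nn

class RowdyActivation(nn.Module):
    """Base activation plus trainable sinusoidal fluctuations
    (hypothetical parameterization; see the paper for the exact form)."""

    def __init__(self, base=torch.tanh, num_terms=3, freq=1.0):
        super().__init__()
        self.base = base
        self.freq = freq
        # One trainable amplitude per sinusoidal term, initialized to zero
        # so training starts from the plain base activation.
        self.amplitudes = nn.Parameter(torch.zeros(num_terms))

    def forward(self, x):
        out = self.base(x)
        for k, a in enumerate(self.amplitudes, start=1):
            # Each term adds an oscillation whose amplitude is learned,
            # which removes flat (saturated) regions of the base activation.
            out = out + a * torch.sin(k * self.freq * x)
        return out

# Usage: drop-in replacement for a fixed nonlinearity.
act = RowdyActivation(num_terms=3)
y = act(torch.linspace(-3.0, 3.0, 100))
```

Initializing the amplitudes at zero keeps the network's initial behavior identical to one with the plain base activation, so the sinusoidal terms only contribute where training finds them useful.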