Multiplication layers are a key component in various influential neural network modules, including self-attention and hypernetwork layers. In this paper, we investigate the approximation capabilities of deep neural networks with intermediate neurons connected by simple multiplication operations. We consider two classes of target functions: generalized bandlimited functions, which are frequently used to model real-world signals with finite bandwidth, and Sobolev-type balls, which are embedded in the Sobolev space $\mathcal{W}^{r,2}$. Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons than standard ReLU neural networks, with respect to both the input dimension and the approximation error. These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design.
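As an illustration (not necessarily the paper's exact construction), a common way to realize a layer of multiplicative neurons is to take elementwise products of two affine maps of the previous layer's output, $z = (W_1 x + b_1) \odot (W_2 x + b_2)$. A minimal NumPy sketch under that assumption:

```python
import numpy as np

def multiplicative_layer(x, W1, b1, W2, b2):
    """One hidden layer whose neurons are elementwise products of two
    affine pre-activations: z = (W1 x + b1) * (W2 x + b2).
    This is an illustrative instantiation of a multiplication gate,
    not a reproduction of the paper's specific architecture."""
    return (W1 @ x + b1) * (W2 @ x + b2)

# Toy usage: d-dimensional input, m multiplicative neurons.
rng = np.random.default_rng(0)
d, m = 4, 8
x = rng.standard_normal(d)
W1, W2 = rng.standard_normal((m, d)), rng.standard_normal((m, d))
b1, b2 = rng.standard_normal(m), rng.standard_normal(m)
z = multiplicative_layer(x, W1, b1, W2, b2)
print(z.shape)  # (8,)
```

Stacking such layers lets the network represent products of many inputs with depth growing only logarithmically in the number of factors, which is the intuition behind the improved layer and neuron counts relative to pure ReLU networks.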