In this paper, we present a spectral-based approach to study the linear approximation of two-layer neural networks. We first consider the case of a single neuron and show that the linear approximability, quantified by the Kolmogorov width, is controlled by the eigenvalue decay of an associated kernel. Then, we show that similar results also hold for two-layer neural networks. This spectral-based approach allows us to obtain upper bounds, lower bounds, and explicit hard examples in a unified manner. In particular, these bounds imply that for networks activated by smooth functions, restricting the norms of inner-layer weights may significantly impair the expressiveness. By contrast, for non-smooth activation functions, such as ReLU, the network expressiveness is independent of the inner-layer weight norms. In addition, we prove that for a family of non-smooth activation functions, including ReLU, approximating any single neuron with random features suffers from the \emph{curse of dimensionality}. This provides an explicit separation of expressiveness between neural networks and random feature models.
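For reference, a minimal sketch of the two central objects mentioned above, using the standard definition of the Kolmogorov $n$-width; the specific kernel form is an assumption based on the usual construction for a single neuron $\sigma(w^\top x)$ with inner weight $w$ drawn from a distribution $\pi$:
\[
  d_n(\mathcal{F}) \;=\; \inf_{\dim V_n = n}\ \sup_{f \in \mathcal{F}}\ \inf_{g \in V_n}\, \| f - g \|_{L^2(\mu)},
  \qquad
  k(x, x') \;=\; \mathbb{E}_{w \sim \pi}\!\left[ \sigma(w^\top x)\, \sigma(w^\top x') \right],
\]
where the outer infimum in $d_n$ runs over all $n$-dimensional linear subspaces $V_n$ of $L^2(\mu)$. The abstract's claim is that the decay rate of the eigenvalues of the integral operator induced by $k$ governs how fast $d_n(\mathcal{F})$ can decrease for the corresponding function class.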