We consider neural network approximation spaces that classify functions according to the rate at which they can be approximated (with error measured in $L^p$) by ReLU neural networks with an increasing number of coefficients, subject to bounds on the magnitude of the coefficients and the number of hidden layers. We prove embedding theorems between these spaces for different values of $p$. Furthermore, we derive sharp embeddings of these approximation spaces into H\"older spaces. We find that, analogous to the case of classical function spaces (such as Sobolev spaces or Besov spaces), it is possible to trade "smoothness" (i.e., approximation rate) for increased integrability. Combined with our earlier results in [arXiv:2104.02746], our embedding theorems imply a somewhat surprising fact related to "learning" functions from a given neural network space based on point samples: if accuracy is measured with respect to the uniform norm, then an optimal "learning" algorithm for reconstructing functions that are well approximable by ReLU neural networks is simply given by piecewise constant interpolation on a tensor product grid.
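To make the reconstruction procedure mentioned above concrete, here is a minimal, purely illustrative NumPy sketch of piecewise constant interpolation on a tensor product grid from point samples, with the error measured in the uniform norm. The grid resolution $m$, the sampling at cell centers, and the toy target function are assumptions for illustration only and are not taken from the paper.

```python
import numpy as np

# Minimal sketch (not the paper's exact construction): reconstruct a function
# on [0,1]^d from samples on a uniform tensor product grid by piecewise
# constant interpolation, i.e. each query point receives the sample value of
# the grid cell containing it. Resolution m and the target f are illustrative.

def tensor_grid_samples(f, m, d):
    """Sample f at the centers of a uniform m^d tensor product grid on [0,1]^d."""
    centers_1d = (np.arange(m) + 0.5) / m
    mesh = np.meshgrid(*([centers_1d] * d), indexing="ij")
    points = np.stack([ax.ravel() for ax in mesh], axis=-1)  # shape (m^d, d)
    return f(points).reshape((m,) * d)

def piecewise_constant_interpolant(samples, x):
    """Evaluate the piecewise constant interpolant at query points x in [0,1]^d."""
    m = samples.shape[0]
    idx = np.clip((x * m).astype(int), 0, m - 1)  # grid-cell index per coordinate
    return samples[tuple(idx.T)]

if __name__ == "__main__":
    d, m = 2, 64
    f = lambda x: np.sin(2 * np.pi * x[:, 0]) * x[:, 1]  # toy target function
    samples = tensor_grid_samples(f, m, d)

    # Estimate the uniform-norm (sup-norm) error on a random test set.
    x_test = np.random.rand(100_000, d)
    err = np.max(np.abs(f(x_test) - piecewise_constant_interpolant(samples, x_test)))
    print(f"sup-norm error with {m**d} samples: {err:.4f}")
```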