In this paper we are concerned with the approximation of functions by single-hidden-layer neural networks with ReLU activation on the unit circle. In particular, we are interested in the case when the number of data points exceeds the number of nodes. We first study the convergence to equilibrium of the stochastic gradient flow associated with the cost function with a quadratic penalization. Specifically, we prove a Poincar\'e inequality for a penalized version of the cost function, with explicit constants that are independent of the data and of the number of nodes. Since our penalization biases the weights to be bounded, this leads us to study how well a network with bounded weights can approximate a given function of bounded variation (BV). Our main contribution concerning the approximation of BV functions is a result which we call the localization theorem. Specifically, it states that the expected error of the constrained problem, in which the lengths of the weights are at most $R$, is of order $R^{-1/9}$ relative to the unconstrained problem (the global optimum). The proof technique is new in this setting and is inspired by methods from the regularity theory of elliptic partial differential equations. Finally, we quantify the expected value of the global optimum by proving a quantitative version of the universal approximation theorem.
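For concreteness, the following display sketches the objects described above under assumed notation; the width $n$, the data $(x_j, y_j)_{j=1}^m$ with $m > n$, the penalization strength $\lambda$, the temperature $\tau$, and the error functional $\mathrm{err}_R$ are placeholders not fixed by the abstract, and the precise norms and normalizations are those of the paper body:
\begin{align*}
  &\text{shallow ReLU network on } \mathbb{S}^1: &
    f_\theta(x) &= \sum_{i=1}^{n} c_i\, \sigma\bigl(\langle w_i, x\rangle + b_i\bigr),
    \qquad \sigma(t) = \max(t, 0),\\
  &\text{penalized cost:} &
    F_\lambda(\theta) &= \frac{1}{m} \sum_{j=1}^{m} \bigl(f_\theta(x_j) - y_j\bigr)^2
    + \lambda\, |\theta|^2,\\
  &\text{Poincar\'e inequality for } \mu_\tau \propto e^{-F_\lambda/\tau}: &
    \operatorname{Var}_{\mu_\tau}(\varphi) &\le C \int |\nabla \varphi|^2 \, d\mu_\tau,\\
  &\text{localization bound:} &
    \mathbb{E}\bigl[\mathrm{err}_R\bigr] - \mathbb{E}\bigl[\mathrm{err}_\infty\bigr]
    &\lesssim R^{-1/9},
    \qquad \mathrm{err}_R = \inf_{|w_i| \le R} \|f - f_\theta\|,
\end{align*}
where the constant $C$ is independent of the data and of $n$.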