Neural networks with random weights appear in a variety of machine learning applications, most prominently as the initialization of many deep learning algorithms and as a computationally cheap alternative to fully learned neural networks. In the present article we enhance the theoretical understanding of random neural nets by addressing the following data separation problem: under what conditions can a random neural network make two classes $\mathcal{X}^-, \mathcal{X}^+ \subset \mathbb{R}^d$ (with positive distance) linearly separable? We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability. Crucially, the number of required neurons is explicitly linked to geometric properties of the underlying sets $\mathcal{X}^-, \mathcal{X}^+$ and their mutual arrangement. This instance-specific viewpoint allows us to overcome the usual curse of dimensionality (exponential width of the layers) in non-pathological situations where the data carries low-complexity structure. We quantify the relevant structure of the data in terms of a novel notion of mutual complexity (based on a localized version of Gaussian mean width), which leads to sound and informative separation guarantees. We connect our result with related lines of work on approximation, memorization, and generalization.
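For concreteness, the following display restates the separation setup in formulas. It is only a sketch of the notation implied by the abstract: the width $n$ and the bias range parameter $\lambda > 0$ are placeholder symbols, not quantities fixed by the result. The random two-layer ReLU feature map acts as
\[
\Phi(x) \;=\; \mathrm{ReLU}(Wx + b) \;=\; \big(\max\{\langle w_i, x\rangle + b_i,\, 0\}\big)_{i=1}^{n},
\qquad w_i \sim \mathcal{N}(0, I_d), \quad b_i \sim \mathrm{Unif}([-\lambda, \lambda]),
\]
and linear separability of the lifted classes means that, with high probability over $W$ and $b$, there exist $a \in \mathbb{R}^n$ and $c \in \mathbb{R}$ such that $\langle a, \Phi(x)\rangle + c < 0$ for all $x \in \mathcal{X}^-$ and $\langle a, \Phi(x)\rangle + c > 0$ for all $x \in \mathcal{X}^+$.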