Vapnik-Chervonenkis (VC) theory has so far been unable to explain the small generalization error of overparametrized neural networks. Indeed, existing applications of VC theory to large networks obtain upper bounds on the VC dimension that are proportional to the number of weights, and for a large class of networks, these upper bounds are known to be tight. In this work, we focus on a class of partially quantized networks that we refer to as hyperplane arrangement neural networks (HANNs). Using a sample compression analysis, we show that HANNs can have VC dimension significantly smaller than the number of weights while remaining highly expressive. In particular, empirical risk minimization over HANNs in the overparametrized regime achieves the minimax rate for classification with Lipschitz posterior class probability. We further demonstrate the expressivity of HANNs empirically: on a panel of 121 UCI datasets, overparametrized HANNs match the performance of state-of-the-art full-precision models.